Prediction of Novel Pseudogenes in Ovine Reference Genome

Article Type:
Research/Original Article (accredited ranking)
Abstract:
Introduction Pseudogenes are copies of ancestral genes that arose through gene duplication or reverse transcription and have undergone changes in the genome. They have been reported in organisms ranging from bacteria to mammals. Pseudogenes increase the genetic diversity of many genes through gene conversion and recombination. Three classes of pseudogenes are known: duplicated pseudogenes; processed (retrotransposed) pseudogenes; and unitary (disabled) pseudogenes. Pseudogenes were long considered nonfunctional genomic sequences. However, recent studies have reported that many of them may have some form of biological activity. It has recently been reported that pseudogenes represent a conspicuous part of the human transcriptome and proteome, as thousands of them are transcribed and hundreds are also translated. It has also been demonstrated that pseudogenes exert important coding-dependent and coding-independent functions within complex regulatory networks. This possible functionality has increased interest in their accurate annotation. To the best of our knowledge, no report is available on high-throughput pseudogene identification in sheep. Therefore, to improve the annotation of the sheep genome, the present study presents the first genome-wide pseudogene identification for protein-coding genes using a homology-based computational approach.
Materials and Methods The pseudogene content of the sheep genome was estimated using an in-house computational annotation pipeline named PseudoPipe. PseudoPipe predicts pseudogenes in a genome using a homology-based method (BLAST followed by a clustering algorithm). In the present study, the repeat-masked sheep reference genome (Ovis_aries.Oar_v3.1), the genome annotation GTF file (version 77), and the sequences of all protein-coding genes were downloaded from the ENSEMBL database. To identify pseudogenes, the sheep genome was searched in a comprehensive and consistent manner. The key steps of the pipeline were to use BLAST to rapidly cross-reference potential "parent" proteins against the intergenic regions of the genome and then to process the resulting "raw hits": eliminating redundant hits, clustering neighboring hits together, and associating and aligning each cluster with a unique parent. Pseudogenes were then classified using a combination of criteria, including homology, intron/exon structure, and the presence of stop codons and frameshifts. Finally, the results were inspected manually and false positives were removed. In addition, the gene ontology (GO) of the parent genes from which the pseudogenes derived was investigated with the DAVID software. Furthermore, characteristics of the newly identified candidate pseudogenes were compared with those of known pseudogenes in human, mouse, and cattle.
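The step of clustering neighboring raw hits can be illustrated with a minimal sketch. This is not the PseudoPipe code: the `Hit` record, the `cluster_hits` function, and the `max_gap` threshold of 50 bp are hypothetical names and values chosen only for illustration, and the sketch assumes that hits have already been filtered to intergenic regions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One raw homology hit (hypothetical record, not a PseudoPipe type)."""
    chrom: str
    start: int
    end: int
    parent: str      # identifier of the candidate parent protein
    strand: str = "+"


def cluster_hits(hits: List[Hit], max_gap: int = 50) -> List[Hit]:
    """Merge overlapping or neighboring hits that share chromosome,
    strand and parent into one cluster.

    max_gap is an assumed threshold: two hits are merged when the gap
    between them is at most max_gap bases (overlaps give a negative gap).
    """
    clusters: List[Hit] = []
    ordered = sorted(hits, key=lambda h: (h.chrom, h.strand, h.parent, h.start))
    for hit in ordered:
        last = clusters[-1] if clusters else None
        if (
            last is not None
            and last.chrom == hit.chrom
            and last.strand == hit.strand
            and last.parent == hit.parent
            and hit.start - last.end <= max_gap
        ):
            # Extend the current cluster to cover this hit.
            last.end = max(last.end, hit.end)
        else:
            # Start a new cluster with a copy so the input is not mutated.
            clusters.append(Hit(hit.chrom, hit.start, hit.end, hit.parent, hit.strand))
    return clusters
```

Under these assumptions, redundant (fully contained) hits are absorbed into their cluster, and each cluster keeps a single parent, which loosely mirrors the "eliminate, cluster, associate" sequence described above.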
Results and Discussion Identifying pseudogenes is vital for improving genome annotation and for understanding disease-related molecular mechanisms. Pseudogene identification is an ongoing effort pursued by several groups. Its complexity can be addressed by in silico analysis with a homology-based, whole-genome identification approach. Here, using a computational method, we identified 4,098 high-confidence pseudogenes in the sheep genome, comprising 1,102 duplicated and 2,996 processed pseudogenes. The GO analysis showed that the parent genes of the identified pseudogenes are significantly enriched in various biological processes, including mRNA splicing, ribosome structure, rRNA binding, mitochondrial electron transport, and translation. Interestingly, a growing body of evidence suggests that the parent genes of pseudogenes are associated with ribosomal, rRNA-related, and translational processes. A detailed comparison with other species showed that our results are consistent with previous studies. For example, the distribution of pseudogenes across the sheep chromosomes was consistent with that in the human and mouse genomes. Moreover, duplicated pseudogenes are reported to be commonly found on the same chromosome as their parent genes. In our results, about 28% of the identified duplicated pseudogenes were on the same chromosome as their parent genes. The results of the study will help to improve the annotation of the sheep genome. The agreement of these results with previous studies supports the accuracy of the method used in this research.
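The reported counts and the same-chromosome statistic can be expressed as a short sketch. The record layout (keys `class`, `chrom`, `parent_chrom`) and the function name `same_chromosome_fraction` are hypothetical and only illustrate how such a proportion could be computed from a table of predictions; the two class counts are the ones stated in the text.

```python
from typing import Dict, List

# Counts stated in the abstract.
DUPLICATED = 1102
PROCESSED = 2996
TOTAL = DUPLICATED + PROCESSED  # 4098, the reported total


def same_chromosome_fraction(records: List[Dict[str, str]]) -> float:
    """Fraction of duplicated pseudogenes located on the same chromosome
    as their parent gene. Records of other classes are ignored."""
    duplicated = [r for r in records if r["class"] == "duplicated"]
    if not duplicated:
        return 0.0
    same = sum(1 for r in duplicated if r["chrom"] == r["parent_chrom"])
    return same / len(duplicated)
```

Applied to the full set of duplicated pseudogenes, a value near 0.28 would correspond to the "about 28%" reported above.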
Conclusion This study has generated, for the first time, a catalog of 4,098 putative sheep pseudogenes. Our findings provide evidence of the pseudogene content of the sheep genome, which is a starting point for understanding their regulatory mechanisms. The identification of these novel pseudogenes improves the annotation of the sheep genome. Similar methods can also be used to improve the genome annotation of other organisms.
Language:
Persian
Published:
Iranian Journal of Animal Science Research, Volume:9 Issue: 4, 2018
Pages:
484 to 497
magiran.com/p1820362  