Annotating and characterizing orphan gene in Zea mays via diverse RNA-Seq data

dc.contributor.advisor Wurtele, Eve
dc.contributor.advisor Yandeau-Nelson, Marna
dc.contributor.advisor Nikolau, Basil
dc.contributor.advisor Liu, Peng
dc.contributor.advisor Dorman, Karin
dc.contributor.author Li, Jing
dc.contributor.department Department of Genetics, Development, and Cell Biology (LAS)
dc.date.accessioned 2022-11-09T05:45:31Z
dc.date.available 2022-11-09T05:45:31Z
dc.date.embargo 2024-09-07T00:00:00Z
dc.date.issued 2022-08
dc.date.updated 2022-11-09T05:45:31Z
dc.description.abstract In the past, many studies dismissed pervasively transcribed but unannotated transcripts as transcriptional "noise"; I refer to these transcripts as the "dark transcriptome". Some functional genes have been identified from this dark transcriptome. Most genes in the dark transcriptome are orphan genes: recently emerged, young genes that share no sequence similarity with proteins in any other species. In the last 20 years, thousands of orphan genes have been experimentally shown to play important roles in diverse species. However, significant limitations remain in our knowledge of orphan genes and their functions. Traditional gene model prediction pipelines rely on ab initio methods and sequence homology, and orphan genes by definition lack homologs, so they are hard to predict by traditional methods. Moreover, orphan genes are usually expressed only under specific conditions. Even though many de novo genes have been evaluated in several studies using evidence from multiple RNA-Seq datasets, the limited range of library conditions may restrict the identification of orphan genes. Currently, the true number of orphan genes in a genome is unknown. Gene function prediction is largely based on sequence and domain similarity, with only a small set of gene functions inferred directly from experimental evidence. Orphan gene function therefore cannot be inferred by these traditional methods. Even though some orphan genes have been experimentally characterized, most of their functions are not integrated into public databases. This dissertation presents methods and tools to evaluate potential orphan genes and to predict orphan genes and their functions efficiently. First, I comprehensively evaluated all potential ORFs in yeast using over 3,000 RNA-Seq and Ribo-Seq samples for evidence of transcription and translation. Next, I developed a lightweight, flexible, reproducible, and well-documented pipeline, BIND, to improve orphan gene prediction.
Finally, I provide improved gene model predictions using BIND, and comprehensive functional annotations using co-expression analysis of over 1,000 RNA-Seq samples from 26 inbred lines of Zea mays subsp. mays. The functional annotations were validated by enrichment analysis combined with differential expression analysis. Thousands of orphan genes showed specific expression in at least one stress condition and tissue. The annotation of pan-orphan genes, especially the inbred line-specific genes in the 26 NAM founder lines, has the potential to help agronomists and geneticists identify molecular markers for marker-assisted selection and develop desired maize varieties.
dc.format.mimetype PDF
dc.identifier.orcid 0000-0003-0761-2977
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/4vGXMKmr
dc.language.iso en
dc.language.rfc3066 en
dc.subject.disciplines Genetics en_US
dc.subject.keywords gene prediction en_US
dc.subject.keywords orphan gene en_US
dc.subject.keywords RNA-Seq en_US
dc.subject.keywords Zea mays en_US
dc.title Annotating and characterizing orphan gene in Zea mays via diverse RNA-Seq data
dc.type dissertation en_US
dc.type.genre dissertation en_US
dspace.entity.type Publication
relation.isOrgUnitOfPublication 9e603b30-6443-4b8e-aff5-57de4a7e4cb2
thesis.degree.discipline Genetics en_US
thesis.degree.grantor Iowa State University en_US
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy en_US
File
Original bundle
Name: Li_iastate_0097E_20288.pdf
Size: 18.44 MB
Format: Adobe Portable Document Format