Insights into the rice and Arabidopsis genomes: intron fates, paralogs, and lineage-specific genes

Lin, Haining

With the availability of a near-complete rice genome sequence, high-quality annotation data, and large expression profile datasets, we examined segmental duplication, intron turnover, and paralogous protein family composition in rice. These data suggest that a large percentage of the rice genome was involved in segmental duplication, creating a large number of paralogous families. We found that singleton and paralogous family genes differed substantially not only in their likelihood of encoding a protein of known or putative function but also in the distribution of specific gene functions. We showed that a significant portion of the duplicated genes in rice show divergent expression, although a relationship between sequence divergence and expression correlation could be seen in very young genes. We observed that intron evolution within the rice genome following segmental duplication is dominated by intron loss rather than intron gain. In addition, with the availability of more complete or near-complete plant genomes and transcriptomes across a wide range of species, we identified and characterized conserved Brassicaceae-specific genes and Arabidopsis lineage-specific genes. Lineage-specific genes in the Brassicaceae and within Arabidopsis were enriched in genes of no known function and appear to be fast evolving at the protein sequence level.