Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data

dc.contributor.author Beiki, Hamid
dc.contributor.author Liu, Haibo
dc.contributor.author Huang, J.
dc.contributor.author Manchanda, Nancy
dc.contributor.author Nonneman, D.
dc.contributor.author Smith, T. P. L.
dc.contributor.author Reecy, James
dc.contributor.author Tuggle, Christopher
dc.contributor.department Department of Animal Science
dc.contributor.department Department of Ecology, Evolution, and Organismal Biology (LAS)
dc.date 2019-07-04T23:59:28.000
dc.date.accessioned 2020-06-29T23:41:02Z
dc.date.available 2020-06-29T23:41:02Z
dc.date.issued 2019-01-01
dc.description.abstract <p>Background: Our understanding of the pig transcriptome is limited. RNA transcript diversity among nine tissues was assessed using poly(A) selected single-molecule long-read isoform sequencing (Iso-seq) and Illumina RNA sequencing (RNA-seq) from a single White cross-bred pig.</p> <p>Results: Across tissues, a total of 67,746 unique transcripts were observed, including 60.5% predicted protein-coding, 36.2% long non-coding RNA and 3.3% nonsense-mediated decay transcripts. On average, 90% of the splice junctions were supported by RNA-seq within tissue. A large proportion (80%) represented novel transcripts, mostly produced by known protein-coding genes (70%), while 17% corresponded to novel genes. On average, four transcripts per known gene (tpg) were identified; an increase over current EBI (1.9 tpg) and NCBI (2.9 tpg) annotations and closer to the number reported in human genome (4.2 tpg). Our new pig genome annotation extended more than 6000 known gene borders (5′ end extension, 3′ end extension, or both) compared to EBI or NCBI annotations. We validated a large proportion of these extensions by independent pig poly(A) selected 3′-RNAseq data, or human FANTOM5 Cap Analysis of Gene Expression data. Further, we detected 10,465 novel genes (81% non-coding) not reported in current pig genome annotations. More than 80% of these novel genes had transcripts detected in > 1 tissue. In addition, more than 80% of novel intergenic genes with at least one transcript detected in liver tissue had H3K4me3 or H3K36me3 peaks mapping to their promoter and gene body, respectively, in independent liver chromatin immunoprecipitation data.</p> <p>Conclusions: These validated results show significant improvement over current pig genome annotations.</p>
dc.description.comments <p>This article is published as Beiki, H., H. Liu, J. Huang, N. Manchanda, D. Nonneman, T. P. L. Smith, J. M. Reecy, and C. K. Tuggle. "Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data." <em>BMC Genomics</em> 20 (2019): 344. doi: <a href="https://doi.org/10.1186/s12864-019-5709-y">10.1186/s12864-019-5709-y</a>.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/ans_pubs/465/
dc.identifier.articleid 1465
dc.identifier.contextkey 14484679
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath ans_pubs/465
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/9896
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/ans_pubs/465/2019_Reecy_ImprovedAnnotation.pdf|||Sat Jan 15 00:23:39 UTC 2022
dc.source.uri 10.1186/s12864-019-5709-y
dc.subject.disciplines Agriculture
dc.subject.disciplines Animal Sciences
dc.subject.keywords Porcine
dc.subject.keywords Transcriptome sequencing
dc.subject.keywords PacBio
dc.subject.keywords Iso-seq
dc.subject.keywords Single molecule long read sequencing
dc.subject.keywords RNA-seq
dc.subject.keywords Genome annotation
dc.title Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data
dc.type article
dc.type.genre article
dspace.entity.type Publication
relation.isAuthorOfPublication fb994cd9-94d5-4370-94ab-f33934c4cd6f
relation.isOrgUnitOfPublication 85ecce08-311a-441b-9c4d-ee2a3569506f
relation.isOrgUnitOfPublication fb57c4c9-fba7-493f-a416-7091a6ecedf1