Variant analysis pipeline for accurate detection of genomic variants from transcriptome sequencing data
Is Version Of
The wealth of information deliverable from transcriptome sequencing (RNA-seq) is significant, however current applications for variant detection still remain a challenge due to the complexity of the transcriptome. Given the ability of RNA-seq to reveal active regions of the genome, detection of RNA-seq SNPs can prove valuable in understanding the phenotypic diversity between populations. Thus, we present a novel computational workflow named VAP (Variant Analysis Pipeline) that takes advantage of multiple RNA-seq splice aware aligners to call SNPs in non-human models using RNA-seq data only. We applied VAP to RNA-seq from a highly inbred chicken line and achieved high accuracy when compared with the matching whole genome sequencing (WGS) data. Over 65% of WGS coding variants were identified from RNA-seq. Further, our results discovered SNPs resulting from post transcriptional modifications, such as RNA editing, which may reveal potentially functional variation that would have otherwise been missed in genomic data. Even with the limitation in detecting variants in expressed regions only, our method proves to be a reliable alternative for SNP identification using RNA-seq data. The source code and user manuals are available at https://modupeore.github.io/VAP/.
This article is published as Adetunji, Modupeore O., Susan J. Lamont, Behnam Abasht, and Carl J. Schmidt. "Variant analysis pipeline for accurate detection of genomic variants from transcriptome sequencing data." PloS ONE 14, no. 9 (2019): e0216838. DOI: 10.1371/journal.pone.0216838. Posted with permission.