Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

dc.contributor.author Andorf, Carson
dc.contributor.author Dobbs, Drena
dc.contributor.author Honavar, Vasant
dc.contributor.department Computer Science
dc.contributor.department Genetics, Development and Cell Biology
dc.contributor.department Bioinformatics and Computational Biology
dc.date 2018-02-18T04:53:02.000
dc.date.accessioned 2020-06-30T04:01:02Z
dc.date.available 2020-06-30T04:01:02Z
dc.date.copyright Mon Jan 01 00:00:00 UTC 2007
dc.date.issued 2007-01-01
dc.description.abstract <h3>Background</h3> <p>Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors.</p> <h3>Results</h3> <p>In a set of 211 previously annotated mouse protein kinases, we found that 201 of the GO annotations returned by AmiGO appear to be <em>inconsistent</em> with the UniProt functions assigned to their human counterparts. In contrast, 97% of the predicted annotations generated using a machine learning approach were <em>consistent</em> with the UniProt annotations of the human counterparts, as well as with available annotations for these mouse protein kinases in the Mouse Kinome database.</p> <h3>Conclusion</h3> <p>We conjecture that most of our predicted annotations are, therefore, correct and suggest that the machine learning approach developed here could be routinely used to detect potential errors in GO annotations generated by high-throughput gene annotation projects.</p>
dc.description.comments <p>This article is from <em>BMC Bioinformatics </em>8 (2007): 284, doi: <a href="http://dx.doi.org/10.1186/1471-2105-8-284" target="_blank">10.1186/1471-2105-8-284</a>. Posted with permission.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/gdcb_las_pubs/102/
dc.identifier.articleid 1107
dc.identifier.contextkey 9743038
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath gdcb_las_pubs/102
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/37765
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/gdcb_las_pubs/102/2007_Dobbs_ExploringInconsistencies.pdf|||Fri Jan 14 18:15:54 UTC 2022
dc.source.uri 10.1186/1471-2105-8-284
dc.subject.disciplines Bioinformatics
dc.subject.disciplines Computational Biology
dc.subject.disciplines Genetics and Genomics
dc.title Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach
dc.type article
dc.type.genre article
dspace.entity.type Publication
relation.isAuthorOfPublication 7e096c4f-9007-41e4-9414-989c3ea9bc88
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
relation.isOrgUnitOfPublication 9e603b30-6443-4b8e-aff5-57de4a7e4cb2
relation.isOrgUnitOfPublication c331f825-0643-499a-9eeb-592c7b43b1f5