Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty

dc.contributor.author Agrawal, Ankit
dc.contributor.author Huang, Xiaoqiu
dc.contributor.department Department of Computer Science
dc.date 2019-02-25T10:06:02.000
dc.date.accessioned 2020-06-30T01:54:40Z
dc.date.available 2020-06-30T01:54:40Z
dc.date.copyright Thu Jan 01 00:00:00 UTC 2009
dc.date.embargo 2019-02-14
dc.date.issued 2009-01-01
dc.description.abstract <p>Background: Accurate estimation of the statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise statistical significance with database statistical significance was conducted. In this paper, we extend the earlier work on pairwise statistical significance by incorporating the use of multiple parameter sets.</p> <p>Results: Results for a knowledge discovery application of homology detection reveal that using multiple parameter sets for pairwise statistical significance estimates gives better coverage than using a single parameter set, at least at some error levels. Further, the results of pairwise statistical significance using multiple parameter sets are shown to be significantly better than the database statistical significance estimates reported by BLAST and PSI-BLAST, and comparable to, and at times significantly better than, those of SSEARCH. Using non-zero parameter set change penalty values gives better performance than using a zero penalty.</p> <p>Conclusion: The fact that homology detection performance does not degrade when using multiple parameter sets is strong evidence for the validity of the assumption that the alignment score distribution follows an extreme value distribution even when multiple parameter sets are used. The parameter set change penalty is a useful parameter for alignment using multiple parameter sets. Pairwise statistical significance using multiple parameter sets can be used effectively to determine the relatedness of one or a few pairs of sequences without performing a time-consuming database search.</p>
dc.description.comments <p>This proceeding was published as Agrawal, Ankit, and Xiaoqiu Huang. "Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty." In BMC Bioinformatics 10 (2009): S1, doi: <a href="https://doi.org/10.1186/1471-2105-10-S3-S1">10.1186/1471-2105-10-S3-S1</a>. From Second International Workshop on Data and Text Mining in Bioinformatics (DTMBio) 2008 Napa Valley, CA, USA. 30 October 2008.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/cs_conf/47/
dc.identifier.articleid 1046
dc.identifier.contextkey 13823659
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath cs_conf/47
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/19855
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/cs_conf/47/2009_Huang_PairwiseStatistical.pdf|||Sat Jan 15 00:25:10 UTC 2022
dc.source.uri 10.1186/1471-2105-10-S3-S1
dc.subject.disciplines Bioinformatics
dc.subject.disciplines Computer Sciences
dc.subject.disciplines Genetics and Genomics
dc.subject.disciplines Statistical Methodology
dc.title Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty
dc.type article
dc.type.genre conference
dspace.entity.type Publication
relation.isAuthorOfPublication e5367231-5ba9-43f5-b3e4-e3e742211b2e
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
File
Name: 2009_Huang_PairwiseStatistical.pdf
Size: 640.36 KB
Format: Adobe Portable Document Format