Algorithms for hierarchical clustering of gene expression data

dc.contributor.author Komarina, Srikanth
dc.contributor.department Electrical and Computer Engineering
dc.date 2020-11-09T01:16:48.000
dc.date.accessioned 2021-02-26T08:56:46Z
dc.date.available 2021-02-26T08:56:46Z
dc.date.copyright Thu Jan 01 00:00:00 UTC 2004
dc.date.issued 2004-01-01
dc.description.abstract <p>Genes are parts of the genome that encode proteins in an organism. Proteins play an important part in many biological processes in any organism. Measuring the expression level of a gene helps biologists estimate the amount of protein produced by that gene. Microarrays can be used to measure the expression levels of thousands of genes in a single experiment. Using additional techniques such as clustering, various correlations among genes of interest can be found. The most commonly used clustering technique for microarray data analysis is hierarchical clustering. Various metrics, such as Euclidean distance, Manhattan distance, and the Pearson correlation coefficient, have been used to measure (dis)similarity between genes. A commonly used software package for hierarchical clustering based on the Pearson correlation coefficient takes O(N^3) time to cluster N genes, even though there are algorithms that can reduce the runtime to O(N^2). In this thesis, we show how the runtime can be reduced to O(N log N) by using a geometric interpretation of the Pearson correlation coefficient, and show that this runtime is optimal.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/rtd/20679/
dc.identifier.articleid 21678
dc.identifier.contextkey 20115128
dc.identifier.doi https://doi.org/10.31274/rtd-20201107-236
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath rtd/20679
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/98046
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/rtd/20679/Komarina_ISU_2004_K66.pdf|||Fri Jan 14 22:26:59 UTC 2022
dc.subject.keywords Electrical and computer engineering
dc.subject.keywords Computer engineering
dc.title Algorithms for hierarchical clustering of gene expression data
dc.type article
dc.type.genre thesis
dspace.entity.type Publication
relation.isOrgUnitOfPublication a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.discipline Computer Engineering
thesis.degree.level thesis
thesis.degree.name Master of Science
File
Original bundle
Name: Komarina_ISU_2004_K66.pdf
Size: 540.91 KB
Format: Adobe Portable Document Format