Biclustering methods and a Bayesian approach to fitting Boltzmann machines in statistical learning

dc.contributor.advisor Stephen B. Vardeman
dc.contributor.author Li, Jing
dc.contributor.department Statistics (LAS)
dc.date 2018-08-11T14:37:49.000
dc.date.accessioned 2020-06-30T02:54:52Z
dc.date.available 2020-06-30T02:54:52Z
dc.date.copyright Wed Jan 01 00:00:00 UTC 2014
dc.date.embargo 2001-01-01
dc.date.issued 2014-01-01
dc.description.abstract <p>This dissertation focuses on two topics in Statistical Learning. One is biclustering, and the other is deep learning. The dissertation has three chapters: Chapters 1 and 2 focus on biclustering, and Chapter 3 focuses on deep learning.</p> <p>Biclustering is a Statistical Learning technique that simultaneously partitions the set of samples and the set of their attributes into homogeneous subsets. In Chapter 1, motivated by movie rating data, we first propose a Bayesian model and an MCMC algorithm for model estimation. Because this algorithm is too slow to be of practical use with current computation power, we next propose a simplified model and design a Genetic Algorithm for maximizing the likelihood function. This approach works well on a small data set. However, due to the NP-hard nature of the problem, both approaches fail to be practically useful with current computation power. Nonetheless, they provide principled ways of solving a biclustering problem for future use as computation power develops.</p> <p>Also motivated by movie rating data, where missing values need to be addressed, in Chapter 2 we propose a new Prototype-based Biclustering method. We evaluate our method on test cases with various percentages of missing values, in terms of the Rand Index between our result and the "true" partitions. Our method performs well on test cases even with a large percentage of missing values. We further evaluate our method on a gene expression data set that contains no missing values. Our method outperforms an existing biclustering method, Spectral Biclustering, under the Mean Squared Error criterion.</p> <p>Deep Learning is a Statistical Learning topic that involves a "deep" network architecture mimicking the information representation structure in the human brain. In Chapter 3, motivated by a hand-written digit classification problem, we propose a Bayesian framework for fitting Boltzmann machine models. The proposed approach surpasses the previously available methods in terms of fitting because it provides a principled fitting method using an MCMC algorithm. The approach presented here also provides a reasonably effective way to extract features from multivariate data for use in classification.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/14173/
dc.identifier.articleid 5180
dc.identifier.contextkey 7766107
dc.identifier.doi https://doi.org/10.31274/etd-180810-3739
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/14173
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/28359
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/14173/0-5_20_13.R|||Fri Jan 14 20:15:30 UTC 2022
dc.source.bitstream archive/lib.dr.iastate.edu/etd/14173/1-gibbsBM.c|||Fri Jan 14 20:15:29 UTC 2022
dc.source.bitstream archive/lib.dr.iastate.edu/etd/14173/2-gibbsRBM.c|||Fri Jan 14 20:15:30 UTC 2022
dc.source.bitstream archive/lib.dr.iastate.edu/etd/14173/3-gibbsRBM2layer.c|||Fri Jan 14 20:15:29 UTC 2022
dc.source.bitstream archive/lib.dr.iastate.edu/etd/14173/Li_iastate_0097E_14605.pdf|||Fri Jan 14 20:15:32 UTC 2022
dc.subject.disciplines Statistics and Probability
dc.subject.keywords Statistics
dc.supplemental.bitstream 5_20_13.R
dc.supplemental.bitstream gibbsBM.c
dc.supplemental.bitstream gibbsRBM.c
dc.supplemental.bitstream gibbsRBM2layer.c
dc.title Biclustering methods and a Bayesian approach to fitting Boltzmann machines in statistical learning
dc.type dissertation
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name: Li_iastate_0097E_14605.pdf; Size: 720.89 KB; Format: Adobe Portable Document Format
Name: 0-5_20_13.R; Size: 23.91 KB; Format: Unknown data format
Name: 2-gibbsRBM.c; Size: 5.39 KB; Format: Unknown data format
Name: 1-gibbsBM.c; Size: 16.41 KB; Format: Unknown data format
Name: 3-gibbsRBM2layer.c; Size: 9.46 KB; Format: Unknown data format