Some Bayes methods for biclustering and vector data with binary coordinates

Chakraborty, Abhishek

File

Chakraborty_iastate_0097E_18226.pdf (10.74 MB)

Date

2019-01-01

Authors

Chakraborty, Abhishek

Advisor

Stephen B. Vardeman

Abstract

We consider Bayes methods for two problems that share a common need to partition index sets encoding commonalities between observations. The first is a biclustering problem. The second is inference for mixture models for $p$-vectors with binary coordinates.

Standard one-way clustering methods form homogeneous groups in a set of objects. Biclustering methods simultaneously cluster the rows and columns of a rectangular dataset so that responses are homogeneous within every row-cluster by column-cluster group. Assuming that data entries follow a normal distribution with a bicluster-specific mean and a common variance, we propose a Bayes methodology for biclustering and corresponding Markov chain Monte Carlo (MCMC) algorithms. Our proposed method not only identifies homogeneous biclusters but also generates plausible predictions for missing/unobserved entries in the potential rectangular dataset, as illustrated through simulation studies and applications to real datasets.
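The sampling model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dissertation's method: the row and column cluster labels are treated as known (the Bayes approach infers them by MCMC), missing entries are filled with a plug-in block average rather than a posterior predictive draw, and every function and variable name here is hypothetical.

```python
import random

def simulate_biclustered(row_labels, col_labels, means, sigma, seed=0):
    """Draw y[i][j] ~ Normal(means[r][c], sigma^2), with r = row_labels[i], c = col_labels[j]."""
    rng = random.Random(seed)
    return [[rng.gauss(means[r][c], sigma) for c in col_labels] for r in row_labels]

def block_means(y, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Average the observed (non-None) entries in each row-cluster by column-cluster block."""
    sums = [[0.0] * n_col_clusters for _ in range(n_row_clusters)]
    counts = [[0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, r in enumerate(row_labels):
        for j, c in enumerate(col_labels):
            if y[i][j] is not None:
                sums[r][c] += y[i][j]
                counts[r][c] += 1
    # A block with no observed entries has no estimate (None).
    return [[sums[r][c] / counts[r][c] if counts[r][c] else None
             for c in range(n_col_clusters)]
            for r in range(n_row_clusters)]

def fill_missing(y, row_labels, col_labels, est):
    """Replace each missing entry with the estimated mean of its block (plug-in prediction)."""
    return [[est[r][c] if y[i][j] is None else y[i][j]
             for j, c in enumerate(col_labels)]
            for i, r in enumerate(row_labels)]

# Assumed toy setup: 5 rows in 2 row clusters, 4 columns in 2 column clusters.
row_labels = [0, 0, 1, 1, 1]
col_labels = [0, 1, 1, 0]
means = [[0.0, 5.0], [-3.0, 2.0]]   # bicluster-specific means
y = simulate_biclustered(row_labels, col_labels, means, sigma=0.5)

y[0][1] = None                       # mark one entry as unobserved
est = block_means(y, row_labels, col_labels, 2, 2)
filled = fill_missing(y, row_labels, col_labels, est)
```

With known labels, the block average is the natural point estimate of each bicluster mean under the common-variance normal model; a full Bayes treatment would instead average predictions over posterior draws of the labels, means, and variance.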

In the second problem, we propose a tractable symmetric distribution for modeling multivariate vectors of 0's and 1's in $p$ dimensions that allows for nontrivial amounts of variation around some central value. We then consider Bayesian analysis of mixture models whose component distributions have this form. Inferences are made from the posterior samples generated by MCMC algorithms. We also extend our proposed Bayesian mixture model analysis to datasets with missing entries. Model performance is illustrated through simulation studies and applications to real datasets.

Academic or Administrative Unit

Statistics (LAS)

Type

dissertation

Copyright

2019-08-01