Factor models for big data

dc.contributor.advisor Somak Dutta
dc.contributor.author Dai, Fan
dc.contributor.department Statistics
dc.date 2021-01-16T18:20:01.000
dc.date.accessioned 2021-02-25T21:38:03Z
dc.date.available 2021-02-25T21:38:03Z
dc.date.copyright Tue Dec 01 00:00:00 UTC 2020
dc.date.embargo 2023-01-07
dc.date.issued 2020-01-01
dc.description.abstract <p>This dissertation is motivated by the problem of clustering dendritic spines, which have attracted interest in neuroscience because spine morphology is closely related to brain functionality. However, modeling and analyzing the morphological data is challenging because the data involve both directional and non-directional features, and there is very little work available on characterizing the dependence among these features in a practically useful manner. In fact, there are very few methods available for modeling the dependence among directional components. Thus, in this collection of works, we present novel methodologies, matrix-free algorithms and real-world applications for modeling and illustrating the variability of data on a high-dimensional sphere and of clustered multivariate data associated with directional features.</p> <p>We develop a matrix-free computational algorithm for fitting a factor model to high-dimensional Gaussian data; such a model can explain the variability of a large set of variables using a small set of factors. Then, we describe a novel family of distributions on the unit sphere that is obtained by radially projecting a Gaussian random variable with factor covariance structure. For practical applications, we further establish a novel matrix-free computational framework for computing maximum likelihood estimates and demonstrate the broad scope of the latent factor model by analyzing data from social networks, resting-state functional magnetic resonance imaging experiments, genetics and digital image databases. Finally, we extend the latent factor model to model and cluster the spine morphological data. Our approach produces three spine groups with distinct morphological features, reveals a relationship among the directional variables and their correlations, and characterizes the variability of all the directional and the non-directional features.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/18299/
dc.identifier.articleid 9306
dc.identifier.contextkey 21104717
dc.identifier.doi https://doi.org/10.31274/etd-20210114-34
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/18299
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/94451
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/18299/Dai_iastate_0097E_19081.pdf|||Fri Jan 14 21:39:48 UTC 2022
dc.subject.keywords AECM algorithm
dc.subject.keywords Directional data
dc.subject.keywords Lanczos algorithm
dc.subject.keywords Profile likelihood
dc.subject.keywords Projected normal
dc.title Factor models for big data
dc.type article
dc.type.genre thesis
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.discipline Statistics
thesis.degree.level thesis
thesis.degree.name Doctor of Philosophy