## Principal Components Analysis of Discrete Datasets

dc.contributor.author: Zhu, Yifan
dc.contributor.department: Statistics
dc.contributor.majorProfessor: Ranjan Maitra
dc.date: 2019-09-22T11:14:44.000
dc.date.accessioned: 2020-06-30T01:32:30Z
dc.date.available: 2020-06-30T01:32:30Z
dc.date.copyright: Mon Jan 01 00:00:00 UTC 2018
dc.date.issued: 2018-01-01
dc.description.abstract:

We propose a Gaussian copula based method for performing principal component analysis on discrete data. Assuming the data come from a discrete distribution in the Gaussian copula family, we can regard the discrete random vectors as generated from a latent multivariate normal random vector. We therefore first estimate the correlation matrix of the latent multivariate normal distribution and then use this estimate to obtain the principal components. We also consider categorical sequence data with multinomial marginal distributions. In this case the marginal distributions are not univariate, so the usual Gaussian copula does not apply. We propose an optimal mapping method that converts the original data with multivariate discrete marginals into mapped data with univariate marginals. The usual Gaussian copula can then model the mapped data, to which we apply the discrete principal component analysis. Senators' voting data serve as the example in our experiment. Finally, we propose a matrix Gaussian copula method for data with multivariate marginals. It can be viewed as an extension of the Gaussian copula, and we use the latent correlation matrix of the matrix Gaussian copula to obtain the principal components.
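
The two-step idea in the abstract (estimate a latent correlation matrix, then extract principal components from it) can be illustrated with a minimal sketch. The abstract does not state which estimator the author uses, so the code below substitutes a simple normal-scores approximation: each discrete column is mapped to the normal quantile of its mid-distribution value, and the Pearson correlation of those scores stands in for the latent correlation. This stand-in is known to be attenuated relative to likelihood-based estimators and is not the author's method. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_discrete(n=2000, p=3, rho=0.6, cuts=(-0.5, 0.5), seed=0):
    """Toy data (an assumption): equicorrelated latent normals cut into ordered levels."""
    rng = random.Random(seed)
    columns = [[] for _ in range(p)]
    for _ in range(n):
        shared = rng.gauss(0, 1)
        for col in columns:
            latent = math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
            col.append(sum(latent > c for c in cuts))  # level in {0, 1, 2}
    return columns


def normal_scores(column):
    """Map each discrete value to the normal quantile of its mid-distribution value."""
    n = len(column)
    counts = {}
    for value in column:
        counts[value] = counts.get(value, 0) + 1
    scores, below = {}, 0
    for value in sorted(counts):
        # Midpoint of the empirical CDF jump at this value; always strictly in (0, 1).
        mid = (below + counts[value] / 2) / n
        scores[value] = STD_NORMAL.inv_cdf(mid)
        below += counts[value]
    return [scores[value] for value in column]


def correlation_matrix(columns):
    """Pearson correlation matrix; assumes no column is constant."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def leading_component(matrix, iterations=500):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    Assumes the uniform starting vector is not orthogonal to the leading
    eigenvector, which holds when all correlations are positive.
    """
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in product))
        vec = [v / norm for v in product]
    eigenvalue = sum(
        vec[i] * matrix[i][j] * vec[j] for i in range(p) for j in range(p)
    )
    return eigenvalue, vec


data = simulate_discrete()
latent_corr = correlation_matrix([normal_scores(col) for col in data])
eigenvalue, loadings = leading_component(latent_corr)
print("approximate latent correlation:", latent_corr)
print("leading eigenvalue and loadings:", eigenvalue, loadings)
```

Only the leading component is computed here to keep the sketch short; a full analysis would take the complete eigendecomposition of the estimated latent correlation matrix.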

dc.format.mimetype: application/pdf
dc.identifier: archive/lib.dr.iastate.edu/creativecomponents/121/
dc.identifier.articleid: 1128
dc.identifier.contextkey: 13405203
dc.identifier.s3bucket: isulib-bepress-aws-west
dc.identifier.submissionpath: creativecomponents/121
dc.identifier.uri: https://dr.lib.iastate.edu/handle/20.500.12876/16648
dc.source.bitstream: archive/lib.dr.iastate.edu/creativecomponents/121/Discrete_Copula_PCA.pdf|||Fri Jan 14 19:13:01 UTC 2022
dc.subject.disciplines: Categorical Data Analysis
dc.subject.disciplines: Statistical Methodology
dc.subject.keywords: Gaussian copula
dc.subject.keywords: PCA
dc.subject.keywords: dimension reduction
dc.subject.keywords: discrete data
dc.title: Principal Components Analysis of Discrete Datasets
dc.type: article
dc.type.genre: creativecomponent
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.discipline: Statistics
thesis.degree.level: creativecomponent

##### Original bundle
Name: Discrete_Copula_PCA.pdf
Size: 953.98 KB
Format: