Simulating Data to Study Performance of Finite Mixture Modeling and Clustering Algorithms

dc.contributor.author Maitra, Ranjan
dc.contributor.author Melnykov, Volodymyr
dc.contributor.department Statistics
dc.date 2018-02-17T18:36:51.000
dc.date.accessioned 2020-07-02T06:58:02Z
dc.date.available 2020-07-02T06:58:02Z
dc.date.copyright Fri Jan 01 00:00:00 UTC 2010
dc.date.issued 2010-01-01
dc.description.abstract <p>A new method is proposed to generate sample Gaussian mixture distributions according to prespecified overlap characteristics. Such methodology is useful for evaluating the performance of clustering algorithms. Our suggested approach involves deriving and calculating the exact overlap between every cluster pair, measured in terms of their total probability of misclassification, and then guiding the simulation of Gaussian components so that they satisfy the prespecified overlap characteristics. The algorithm is illustrated in two and five dimensions using contour plots and parallel distribution plots, respectively; we introduce and develop the latter to display mixture distributions in higher dimensions. We also study properties of the algorithm and variability in the simulated mixtures. The utility of the suggested algorithm is demonstrated via a study of initialization strategies in Gaussian clustering. This article has supplementary material online.</p>
dc.description.comments <p>This is an Accepted Manuscript of an article published by Taylor & Francis in <em>Journal of Computational and Graphical Statistics</em> in January 2012, available online: http://www.tandf.com/<a href="http://dx.doi.org/10.1198/jcgs.2009.08054" target="_blank">10.1198/jcgs.2009.08054</a>.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/stat_las_pubs/72/
dc.identifier.articleid 1076
dc.identifier.contextkey 8822401
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath stat_las_pubs/72
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/90673
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/stat_las_pubs/72/2010_MaitraR_SimulatingDataStudy.pdf|||Sat Jan 15 01:44:06 UTC 2022
dc.source.uri 10.1198/jcgs.2009.08054
dc.subject.disciplines Statistics and Probability
dc.subject.keywords Cluster overlap
dc.subject.keywords Eccentricity of ellipsoid
dc.subject.keywords Mclust
dc.subject.keywords MixSim
dc.subject.keywords Mixture distribution
dc.subject.keywords Parallel distribution plots
dc.title Simulating Data to Study Performance of Finite Mixture Modeling and Clustering Algorithms
dc.type article
dc.type.genre article
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
File
Original bundle
Name: 2010_MaitraR_SimulatingDataStudy.pdf
Size: 1.54 MB
Format: Adobe Portable Document Format