A separability index for clustering and classification problems with applications to cluster merging and systematic evaluation of clustering algorithms

dc.contributor.advisor Arka P. Ghosh
dc.contributor.advisor Ranjan Maitra
dc.contributor.author Peterson, Anna
dc.contributor.department Statistics
dc.date 2018-08-11T11:39:29.000
dc.date.accessioned 2020-06-30T02:40:50Z
dc.date.available 2020-06-30T02:40:50Z
dc.date.copyright Sat Jan 01 00:00:00 UTC 2011
dc.date.embargo 2013-06-05
dc.date.issued 2011-01-01
dc.description.abstract A separability index quantifying the degree of difficulty in a hard clustering problem is proposed under the assumption of a multivariate Gaussian distribution for each group. We first define a preliminary index and explore its properties both theoretically and numerically. Adjustments are then made to this index so that the final refinement is also interpretable in terms of the Adjusted Rand Index between a true grouping and its hypothetical idealized clustering, taken as a surrogate of clustering complexity. Our derived index is used to develop a data-simulation algorithm that generates samples according to a prescribed value of the index. This algorithm is particularly useful for systematically generating datasets with varying degrees of clustering difficulty, which we use to evaluate the performance of different clustering algorithms. The index is also shown to be useful in providing a summary of the distinctiveness of classes in grouped datasets.
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/12173/
dc.identifier.articleid 3186
dc.identifier.contextkey 2808384
dc.identifier.doi https://doi.org/10.31274/etd-180810-1463
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/12173
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/26366
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/12173/Peterson_iastate_0097E_11961.pdf|||Fri Jan 14 19:14:33 UTC 2022
dc.subject.disciplines Statistics and Probability
dc.subject.keywords clustering
dc.subject.keywords cluster merging
dc.subject.keywords K-means
dc.subject.keywords separability index
dc.title A separability index for clustering and classification problems with applications to cluster merging and systematic evaluation of clustering algorithms
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name: Peterson_iastate_0097E_11961.pdf
Size: 55.4 MB
Format: Adobe Portable Document Format