Merging K‐means with hierarchical clustering for identifying general‐shaped groups

Date
2018-01-01
Authors
Peterson, Anna
Ghosh, Arka
Maitra, Ranjan
Department
Statistics
Abstract

Clustering partitions a dataset so that observations placed in the same group are similar to one another but different from those in other groups. Hierarchical and K‐means clustering are two common approaches with different strengths and weaknesses. For instance, hierarchical clustering identifies groups in a tree‐like structure but becomes computationally expensive for large datasets, while K‐means clustering is efficient but is designed to identify homogeneous, spherically shaped clusters. We present a hybrid non‐parametric clustering approach that combines the two methods to identify general‐shaped clusters and that can be applied to larger datasets. Specifically, we first partition the dataset into spherical groups using K‐means. We then merge these groups using hierarchical methods, with a data‐driven distance measure serving as the stopping criterion. Our proposal has the potential to reveal groups with general shapes and structure in a dataset. We demonstrate good performance on several simulated and real datasets.
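
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain Lloyd's K‐means with evenly spaced initial centers (chosen only for reproducibility), single-linkage merging on distances between group centroids, and a fixed, hypothetical `merge_threshold` standing in for the paper's data‐driven stopping criterion. The function names `kmeans` and `merge_groups` are likewise illustrative.

```python
import numpy as np


def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm; returns (labels, centroids).

    Assumption: initial centers are evenly spaced rows of X, a
    simplification for reproducibility rather than a recommended scheme.
    """
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if its group is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(X, centroids), centroids


def merge_groups(centroids, merge_threshold):
    """Single-linkage merging of K-means groups by centroid distance.

    Groups whose centroids are chained together by distances of at most
    `merge_threshold` (a hypothetical fixed cutoff) end up in one group.
    Returns an array mapping each K-means group to a merged group id.
    """
    k = len(centroids)
    parent = list(range(k))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    for i in range(k):
        for j in range(i + 1, k):
            if d[i, j] <= merge_threshold:
                parent[find(i)] = find(j)

    roots = np.array([find(i) for i in range(k)])
    _, merged = np.unique(roots, return_inverse=True)
    return merged


# Toy data: two long, thin horizontal strips, which a single spherical
# K-means group cannot represent well.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
strip_a = np.column_stack([x, rng.normal(0.0, 0.2, size=100)])
strip_b = np.column_stack([x, 20.0 + rng.normal(0.0, 0.2, size=100)])
X = np.vstack([strip_a, strip_b])

labels, centroids = kmeans(X, k=8)             # stage 1: many small groups
group_of = merge_groups(centroids, 5.0)        # stage 2: merge nearby groups
final_labels = group_of[labels]                # merged group for each point
```

In this toy example each strip is first cut into several compact K‐means groups, and the merging step then chains neighbouring groups within a strip back together while leaving the two strips apart. The choice of cutoff drives the result, which is why a data‐driven criterion such as the one proposed in the paper matters in practice.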

Comments

This is the peer-reviewed version of the following article: Peterson, Anna D., Arka P. Ghosh, and Ranjan Maitra. "Merging K‐means with hierarchical clustering for identifying general‐shaped groups." Stat 7, no. 1 (2018): e172, which has been published in final form at DOI: 10.1002/sta4.172. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
