The K-modes algorithm for clustering

Machine Learning 2013-04-25 v1 Methodology Machine Learning

Abstract

Many clustering algorithms exist that estimate a cluster centroid, such as K-means, K-medoids or mean-shift, but no algorithm seems to exist that clusters data by returning exactly K meaningful modes. We propose a natural definition of a K-modes objective function by combining the notions of density and cluster assignment. The algorithm becomes K-means and K-medoids in the limit of very large and very small scales, respectively. Computationally, it is slightly slower than K-means but much faster than mean-shift or K-medoids. Unlike K-means, it is able to find centroids that are valid patterns, truly representative of a cluster, even with nonconvex clusters, and appears robust to outliers and misspecification of the scale and number of clusters.
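
The abstract does not spell out the objective or the update rules, so the following is only a minimal sketch of one way to combine density and cluster assignment, not the authors' implementation. It assumes a Gaussian kernel with scale `sigma`, a nearest-centroid assignment step as in K-means, and one mean-shift update per centroid restricted to the points of its own cluster. The function name `k_modes_sketch` and all parameter names are hypothetical.

```python
import numpy as np


def _assign(X, C):
    # Index of the nearest centroid for every point (squared Euclidean distance).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def k_modes_sketch(X, K, sigma, init=None, n_iter=50, seed=0):
    """Hypothetical sketch: alternate K-means-style assignment with a
    Gaussian mean-shift step on each cluster's own points."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Assumption: start from K distinct data points chosen at random.
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), size=K, replace=False)].copy()
    else:
        C = np.array(init, dtype=float)

    for _ in range(n_iter):
        labels = _assign(X, C)
        for k in range(K):
            pts = X[labels == k]
            if len(pts) == 0:
                continue  # leave the centroid of an empty cluster unchanged
            d2 = ((pts - C[k]) ** 2).sum(axis=1)
            # Subtracting d2.min() keeps the weights from underflowing for
            # small sigma; it cancels out in the normalised average below.
            w = np.exp(-0.5 * (d2 - d2.min()) / sigma**2)
            C[k] = (w[:, None] * pts).sum(axis=0) / w.sum()

    return C, _assign(X, C)
```

In this sketch a very large `sigma` makes the kernel weights nearly uniform, so each centroid update reduces to the cluster mean, which matches the K-means limit mentioned above. A very small `sigma` concentrates the weight on the data points closest to the centroid, so the centroid stays close to an actual data point.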

Cite

@article{arxiv.1304.6478,
  title  = {The K-modes algorithm for clustering},
  author = {Miguel Á. Carreira-Perpiñán and Weiran Wang},
  journal= {arXiv preprint arXiv:1304.6478},
  year   = {2013}
}

Comments

13 pages, 5 figures
