Related papers: k-means requires exponentially many iterations eve…

Improved Smoothed Analysis of the k-Means Method

The k-means method is a widely used clustering algorithm. One of its distinguishing features is its speed in practice. Its worst-case running time, however, is exponential, leaving a gap between practical and theoretical performance. Arthur…

Data Structures and Algorithms · Computer Science 2008-09-11 Bodo Manthey, Heiko Röglin

k-Means has Polynomial Smoothed Complexity

The k-means method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. In order to close the gap between…

Data Structures and Algorithms · Computer Science 2009-08-07 David Arthur, Bodo Manthey, Heiko Röglin

A balanced k-means algorithm for weighted point sets

The classical $k$-means algorithm for partitioning $n$ points in $\mathbb{R}^d$ into $k$ clusters is one of the most popular and widely spread clustering methods. The need to respect prescribed lower bounds on the cluster sizes has been…

Optimization and Control · Mathematics 2016-08-04 Steffen Borgwardt, Andreas Brieden, Peter Gritzmann

K+ Means : An Enhancement Over K-Means Clustering Algorithm

K-means (MacQueen, 1967) [1] is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set to a predefined, say K number of…

Machine Learning · Computer Science 2017-06-23 Srikanta Kolay, Kumar Sankar Ray, Abhoy Chand Mondal

On Variants of k-means Clustering

Clustering problems often arise in fields like data mining and machine learning, where the goal is to group a collection of objects into similar groups with respect to a similarity (or dissimilarity) measure. Among the clustering problems,…

Computational Geometry · Computer Science 2015-12-10 Sayan Bandyapadhyay, Kasturi Varadarajan

Fast Exact k-Means, k-Medians and Bregman Divergence Clustering in 1D

The $k$-Means clustering problem on $n$ points is NP-hard for any dimension $d\ge 2$; however, for the 1D case there exist exact polynomial-time algorithms. Previous literature reported an $O(kn^2)$ time dynamic programming algorithm that…

Data Structures and Algorithms · Computer Science 2018-04-26 Allan Grønlund, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, Mingzhou Song

The Bane of Low-Dimensionality Clustering

In this paper, we give a conditional lower bound of $n^{\Omega(k)}$ on running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four,…

Data Structures and Algorithms · Computer Science 2017-11-06 Vincent Cohen-Addad, Arnaud de Mesmay, Eva Rotenberg, Alan Roytman

k^2-means for fast and accurate large scale clustering

We propose k^2-means, a new clustering method which efficiently copes with large numbers of clusters and achieves low energy solutions. k^2-means builds upon the standard k-means (Lloyd's algorithm) and combines a new strategy to accelerate…

Machine Learning · Computer Science 2016-05-31 Eirikur Agustsson, Radu Timofte, Luc Van Gool

Fast k-means algorithm clustering

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculation between all data points and the centers, the time cost will be high when the size of…

Data Structures and Algorithms · Computer Science 2011-08-08 Raied Salman, Vojislav Kecman, Qi Li, Robert Strack, Erik Test

Faster Algorithms for the Constrained k-means Problem

The classical center-based clustering problems such as $k$-means/median/center assume that the optimal clusters satisfy the locality property that the points in the same cluster are close to each other. A number of clustering problems arise…

Data Structures and Algorithms · Computer Science 2015-04-13 Anup Bhattacharya, Ragesh Jaiswal, Amit Kumar

Almost-Optimal Upper and Lower Bounds for Clustering in Low Dimensional Euclidean Spaces

The $k$-median and $k$-means clustering objectives are classic objectives for modeling clustering in a metric space. Given a set of points in a metric space, the goal of the $k$-median (resp. $k$-means) problem is to find $k$ representative…

Computational Geometry · Computer Science 2026-03-11 Vincent Cohen-Addad, Karthik C. S., David Saulpic, Chris Schwiegelshohn

Accelerating k-Means Clustering with Cover Trees

The k-means clustering algorithm is a popular algorithm that partitions data into k clusters. There are many improvements to accelerate the standard algorithm. Most current research employs upper and lower bounds on point-to-cluster…

Machine Learning · Computer Science 2024-10-22 Andreas Lang, Erich Schubert

Near-Optimal Explainable $k$-Means for All Dimensions

Many clustering algorithms are guided by certain cost functions such as the widely-used $k$-means cost. These algorithms divide data points into clusters with often complicated boundaries, creating difficulties in explaining the clustering…

Machine Learning · Computer Science 2021-11-05 Moses Charikar, Lunjia Hu

Fast $k$-means Seeding Under The Manifold Hypothesis

We study beyond worst case analysis for the $k$-means problem where the goal is to model typical instances of $k$-means arising in practice. Existing theoretical approaches provide guarantees under certain assumptions on the optimal…

Data Structures and Algorithms · Computer Science 2026-02-03 Poojan Shah, Shashwat Agrawal, Ragesh Jaiswal

On Generalization Bounds for Projective Clustering

Given a set of points, clustering consists of finding a partition of a point set into $k$ clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the…

Machine Learning · Computer Science 2023-10-16 Maria Sofia Bucarelli, Matilde Fjeldsø Larsen, Chris Schwiegelshohn, Mads Bech Toftrup

Exact Exponential Algorithms for Clustering Problems

In this paper we initiate a systematic study of exact algorithms for well-known clustering problems, namely $k$-Median and $k$-Means. In $k$-Median, the input consists of a set $X$ of $n$ points belonging to a metric space, and the task is…

Data Structures and Algorithms · Computer Science 2022-08-16 Fedor V. Fomin, Petr A. Golovach, Tanmay Inamdar, Nidhi Purohit, Saket Saurabh

Explainable $k$-Means and $k$-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science 2020-09-23 Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

Global $k$-means$++$: an effective relaxation of the global $k$-means clustering algorithm

The $k$-means algorithm is a prevalent clustering method due to its simplicity, effectiveness, and speed. However, its main disadvantage is its high sensitivity to the initial positions of the cluster centers. The global $k$-means is a…

Machine Learning · Computer Science 2023-07-17 Georgios Vardakas, Aristidis Likas

Seeding K-Means using Method of Moments

K-means is one of the most widely used algorithms for clustering in data mining applications. It attempts to minimize the sum of the squared Euclidean distances of the points in the clusters from the respective means of the…

Machine Learning · Computer Science 2016-11-01 Sayantan Dasgupta

Near-Optimal Bounds for Parameterized Euclidean k-means

The $k$-means problem is a classic objective for modeling clustering in a metric space. Given a set of points in a metric space, the goal is to find $k$ representative points so as to minimize the sum of the squared distances from each…

Computational Geometry · Computer Science 2026-03-31 Vincent Cohen-Addad, Karthik C. S., David Saulpic, Chris Schwiegelshohn