Related papers: Universal Algorithms for Clustering Problems

Exact Exponential Algorithms for Clustering Problems

In this paper we initiate a systematic study of exact algorithms for well-known clustering problems, namely $k$-Median and $k$-Means. In $k$-Median, the input consists of a set $X$ of $n$ points belonging to a metric space, and the task is…

Data Structures and Algorithms · Computer Science · 2022-08-16 · Fedor V. Fomin, Petr A. Golovach, Tanmay Inamdar, Nidhi Purohit, Saket Saurabh

Competitively Consistent Clustering

In fully-dynamic consistent clustering, we are given a finite metric space $(M,d)$, and a set $F\subseteq M$ of possible locations for opening centers. Data points arrive and depart, and the goal is to maintain an approximately optimal…

Data Structures and Algorithms · Computer Science · 2025-08-15 · Niv Buchbinder, Roie Levin, Yue Yang

Online k-means Clustering

We study the problem of online clustering where a clustering algorithm has to assign a new point that arrives to one of $k$ clusters. The specific formulation we use is the $k$-means objective: At each time step the algorithm has to…

Machine Learning · Computer Science · 2021-04-22 · Vincent Cohen-Addad, Benjamin Guedj, Varun Kanade, Guy Rom

Faster Algorithms for the Constrained k-means Problem

The classical center based clustering problems such as $k$-means/median/center assume that the optimal clusters satisfy the locality property that the points in the same cluster are close to each other. A number of clustering problems arise…

Data Structures and Algorithms · Computer Science · 2015-04-13 · Anup Bhattacharya, Ragesh Jaiswal, Amit Kumar

Near-Optimal Clustering in the $k$-machine model

The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in applications in bioinformatics, image processing, social network analysis, etc.). As sizes of data sets have grown…

Distributed, Parallel, and Cluster Computing · Computer Science · 2017-10-24 · Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, Sriram V. Pemmaraju

A Global Optimization Algorithm for K-Center Clustering of One Billion Samples

This paper presents a practical global optimization algorithm for the K-center clustering problem, which aims to select K samples as the cluster centers to minimize the maximum within-cluster distance. This algorithm is based on a…

Optimization and Control · Mathematics · 2026-03-04 · Jiayang Ren, Ningning You, Kaixun Hua, Chaojie Ji, Yankai Cao

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

$k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains…

Quantum Physics · Physics · 2023-06-06 · Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H.-C. Jiang

Optimal Time Bounds for Approximate Clustering

Clustering is a fundamental problem in unsupervised learning, and has been studied widely both as a problem of learning mixture models and as an optimization problem. In this paper, we study clustering with respect to the $k$-median…

Data Structures and Algorithms · Computer Science · 2013-01-07 · Ramgopal Mettu, Greg Plaxton

On Generalization Bounds for Projective Clustering

Given a set of points, clustering consists of finding a partition of a point set into $k$ clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the…

Machine Learning · Computer Science · 2023-10-16 · Maria Sofia Bucarelli, Matilde Fjeldsø Larsen, Chris Schwiegelshohn, Mads Bech Toftrup

Explainable $k$-Means and $k$-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science · 2020-09-23 · Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

Clustering to Minimize Cluster-Aware Norm Objectives

We initiate the study of the following general clustering problem. We seek to partition a given set $P$ of data points into $k$ clusters by finding a set $X$ of $k$ centers and assigning each data point to one of the centers. The cost of a…

Data Structures and Algorithms · Computer Science · 2024-11-01 · Martin G. Herold, Evangelos Kipouridis, Joachim Spoerhase

Faster Balanced Clusterings in High Dimension

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science · 2018-09-11 · Hu Ding

Simple, Scalable and Effective Clustering via One-Dimensional Projections

Clustering is a fundamental problem in unsupervised machine learning with many applications in data analysis. Popular clustering algorithms such as Lloyd's algorithm and $k$-means++ can take $\Omega(ndk)$ time when clustering $n$ points in…

Machine Learning · Computer Science · 2023-10-26 · Moses Charikar, Monika Henzinger, Lunjia Hu, Maxmilian Vötsch, Erik Waingarten

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be…

Machine Learning · Computer Science · 2023-03-02 · Thy Nguyen, Anamay Chaturvedi, Huy Lê Nguyen

Analysis of Agglomerative Clustering

The diameter $k$-clustering problem is the problem of partitioning a finite subset of $\mathbb{R}^d$ into $k$ subsets called clusters such that the maximum diameter of the clusters is minimized. One early clustering algorithm that computes…

Data Structures and Algorithms · Computer Science · 2014-03-10 · Marcel R. Ackermann, Johannes Blömer, Daniel Kuntze, Christian Sohler

Global $k$-means$++$: an effective relaxation of the global $k$-means clustering algorithm

The $k$-means algorithm is a prevalent clustering method due to its simplicity, effectiveness, and speed. However, its main disadvantage is its high sensitivity to the initial positions of the cluster centers. The global $k$-means is a…

Machine Learning · Computer Science · 2023-07-17 · Georgios Vardakas, Aristidis Likas

Fast Clustering using MapReduce

Clustering problems have numerous applications and are becoming more challenging as the size of the data increases. In this paper, we consider designing clustering algorithms that can be used in MapReduce, the most popular programming…

Distributed, Parallel, and Cluster Computing · Computer Science · 2011-09-09 · Alina Ene, Sungjin Im, Benjamin Moseley

Near-Optimal Algorithms for Constrained k-Center Clustering with Instance-level Background Knowledge

Center-based clustering has attracted significant research interest from both theory and practice. In many practical applications, input data often contain background knowledge that can be used to improve clustering results. In this work,…

Machine Learning · Computer Science · 2025-06-13 · Longkun Guo, Chaoqi Jia, Kewen Liao, Zhigang Lu, Minhui Xue

A cutting plane algorithm for globally solving low dimensional k-means clustering problems

Clustering is one of the most fundamental tools in data science and machine learning, and k-means clustering is one of the most common such methods. There is a variety of approximate algorithms for the k-means problem, but computing the…

Optimization and Control · Mathematics · 2024-02-22 · Martin Ryner, Jan Kronqvist, Johan Karlsson

A bi-criteria approximation algorithm for $k$ Means

We consider the classical $k$-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output $\beta k > k$ clusters, and must produce a clustering with cost at most $\alpha$ times the…

Data Structures and Algorithms · Computer Science · 2015-08-04 · Konstantin Makarychev, Yury Makarychev, Maxim Sviridenko, Justin Ward