Related papers: Label-consistent clustering for evolving data

200 papers

Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focus on the stability of cluster centers; unfortunately, this neglects the…

Data Structures and Algorithms · Computer Science · 2025-12-23 · Diptarka Chakraborty, Hendrik Fichtenberger, Bernhard Haeupler, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson

The problem of constrained $k$-center clustering has attracted significant attention in the past decades. In this paper, we study balanced $k$-center clustering, where the size of each cluster is constrained by the given lower and upper bounds.…

Computational Geometry · Computer Science · 2017-04-11 · Hu Ding

The problem of constrained clustering has attracted significant attention in the past decades. In this paper, we study the balanced $k$-center, $k$-median, and $k$-means clustering problems where the size of each cluster is constrained by…

Computational Geometry · Computer Science · 2018-09-11 · Hu Ding

We study the consistent $k$-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the…

Data Structures and Algorithms · Computer Science · 2023-07-27 · Jakub Łącki, Bernhard Haeupler, Christoph Grunau, Václav Rozhoň, Rajesh Jayaram

Clustering is one of the most fundamental problems in unsupervised learning with a large number of applications. However, classical clustering algorithms assume that the data is static, thus failing to capture many real-world applications…

Data Structures and Algorithms · Computer Science · 2020-02-11 · Gramoz Goranci, Monika Henzinger, Dariusz Leniowski, Christian Schulz, Alexander Svozil

We study two generalizations of classic clustering problems called dynamic ordered $k$-median and dynamic $k$-supplier, where the points that need clustering evolve over time, and we are allowed to move the cluster centers between…

Data Structures and Algorithms · Computer Science · 2022-07-26 · Shichuan Deng, Jian Li, Yuval Rabani

The $k$-center problem is a fundamental clustering variant with applications in learning systems and data summarization. In several real-world scenarios, the dataset to be clustered is not static, but evolves over time, as new data points…

Data Structures and Algorithms · Computer Science · 2026-03-25 · Simone Moretti, Paolo Pellizzoni, Andrea Pietracaprina, Geppino Pucci

In this paper, we investigate the learning-augmented $k$-median clustering problem, which aims to improve the performance of traditional clustering algorithms by preprocessing the point set with a predictor of error rate $\alpha \in [0,1)$.…

Data Structures and Algorithms · Computer Science · 2026-03-12 · Kangke Cheng, Shihong Song, Guanlin Mo, Hu Ding

Given points from an arbitrary metric space and a sequence of point updates sent by an adversary, what is the minimum recourse per update (i.e., the minimum number of changes needed to the set of centers after an update), in order to…

Data Structures and Algorithms · Computer Science · 2025-06-04 · Sebastian Forster, Antonis Skarlatos

We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be…

Machine Learning · Computer Science · 2023-03-02 · Thy Nguyen, Anamay Chaturvedi, Huy Lê Nguyen

Center-based clustering techniques are fundamental in some areas of machine learning such as data summarization. Generic $k$-center algorithms can produce biased cluster representatives, so there has been recent interest in fair $k$-center…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-02-21 · Jinxiang Gan, Mordecai Golin, Zonghan Yang, Yuhao Zhang

Given a stream of points in a metric space, is it possible to maintain a constant approximate clustering by changing the cluster centers only a small number of times during the entire execution of the algorithm? This question received…

Data Structures and Algorithms · Computer Science · 2020-11-16 · Hendrik Fichtenberger, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson

$K$-means, a simple and effective clustering algorithm, is one of the most widely used algorithms in the multimedia and computer vision communities. Traditional $k$-means is an iterative algorithm: in each iteration new cluster centers are…

Computer Vision and Pattern Recognition · Computer Science · 2013-12-12 · Jingdong Wang, Jing Wang, Qifa Ke, Gang Zeng, Shipeng Li

In discrete $k$-center and $k$-median clustering, we are given a set of points $P$ in a metric space $M$, and the task is to output a set $C \subseteq P$, $|C| = k$, such that the cost of clustering $P$ using $C$ is as small as possible. For $k$-center,…

Data Structures and Algorithms · Computer Science · 2013-07-10 · Nirman Kumar, Benjamin Raichel

We study k-median clustering under the sequential no-substitution setting. In this setting, a data stream is sequentially observed, and some of the points are selected by the algorithm as cluster centers. However, a point can be selected as…

Machine Learning · Computer Science · 2022-04-14 · Tom Hess, Michal Moshkovitz, Sivan Sabato

Supervised classification can be effective for prediction but sometimes weak on interpretability or explainability (XAI). Clustering, on the other hand, tends to isolate categories or profiles that can be meaningful but there is no…

Machine Learning · Computer Science · 2021-04-27 · Vincent Lemaire, Oumaima Alaoui Ismaili, Antoine Cornuéjols, Dominique Gay

The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their easiness in the…

Machine Learning · Statistics · 2018-01-10 · Marco Capó, Aritz Pérez, Jose A. Lozano

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science · 2022-03-30 · Georgios Exarchakis, Omar Oubari, Gregor Lenz

In data summarization we want to choose $k$ prototypes in order to summarize a data set. We study a setting where the data set comprises several demographic groups and we are restricted to choose $k_i$ prototypes belonging to group $i$. A…

Machine Learning · Statistics · 2019-05-14 · Matthäus Kleindessner, Pranjal Awasthi, Jamie Morgenstern

One key use of k-means clustering is to identify cluster prototypes which can serve as representative points for a dataset. However, a drawback of using k-means cluster centers as representative points is that such points distort the…

Machine Learning · Statistics · 2019-11-15 · Arvind Krishna, Simon Mak, Roshan Joseph