Label-consistent clustering for evolving data

Data Structures and Algorithms 2025-12-18 v1 Machine Learning

Abstract

Data analysis is often an iterative process in which solutions must be continuously refined in response to new data. Typically, as new data becomes available, an existing solution must be updated to incorporate the latest information. In addition to seeking a high-quality solution for the task at hand, it is also crucial to ensure consistency by minimizing drastic changes from previous solutions. Applying this approach across many iterations ensures that the solution evolves gradually and smoothly. In this paper, we study this problem in the context of clustering, focusing on the $k$-center problem. More precisely, we study the following problem: given a set of points $X$, parameters $k$ and $b$, and a prior clustering solution $H$ for $X$, our goal is to compute a new solution $C$ for $X$, consisting of $k$ centers, that minimizes the clustering cost while introducing at most $b$ changes from $H$. We refer to this problem as label-consistent $k$-center, and we propose two constant-factor approximation algorithms for it. We complement our theoretical findings with an experimental evaluation demonstrating the effectiveness of our methods on real-world datasets.
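To make the problem statement concrete, the following is a minimal brute-force sketch for tiny instances. It is not one of the approximation algorithms proposed in the paper. It rests on assumptions that the abstract does not spell out: the prior solution is given as one prior center per point, centers are chosen from the input points and prior centers, a "change" is a point whose assigned center differs from its prior center, and the cost is the largest point-to-assigned-center distance. The function names `radius_for_centers` and `label_consistent_k_center` are hypothetical.

```python
from itertools import combinations
from math import dist, inf


def radius_for_centers(points, prior, centers, b):
    """Smallest radius achievable with these centers and at most b label changes.

    Assumption: prior[i] is the prior center of points[i]; a point counts as
    changed when it is assigned to a center other than prior[i].
    Returns None if no assignment respects the budget b.
    """
    # Cost of keeping the prior label (infinite if that center was dropped).
    keep = [dist(x, h) if h in centers else inf for x, h in zip(points, prior)]
    # Cost of relabeling the point to its nearest new center.
    switch = [min(dist(x, c) for c in centers) for x in points]

    for r in sorted(set(keep + switch)):
        if r == inf:
            continue
        # Points whose prior label is too expensive at radius r must switch.
        forced = [s for kc, s in zip(keep, switch) if kc > r]
        if all(s <= r for s in forced) and len(forced) <= b:
            return r
    return None


def label_consistent_k_center(points, prior, k, b):
    """Exhaustive search over all k-subsets of candidate centers (exponential)."""
    candidates = sorted(set(points) | set(prior))
    best = None
    for centers in combinations(candidates, k):
        r = radius_for_centers(points, prior, centers, b)
        if r is not None and (best is None or r < best[0]):
            best = (r, centers)
    return best


# Toy instance: the point (10, 0) was previously labeled with the far center (0, 0).
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
prior = [(0, 0), (0, 0), (0, 0), (11, 0)]
print(label_consistent_k_center(points, prior, k=2, b=0))
print(label_consistent_k_center(points, prior, k=2, b=1))
```

On this toy instance, a budget of $b = 0$ forces every point to keep its prior label and yields radius 10, while $b = 1$ allows the single relabeling of (10, 0) and reduces the radius to 1. This illustrates the trade-off between clustering cost and consistency under the stated assumptions only.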

Cite

@article{arxiv.2512.15210,
  title  = {Label-consistent clustering for evolving data},
  author = {Ameet Gadekar and Aristides Gionis and Thibault Marette},
  journal= {arXiv preprint arXiv:2512.15210},
  year   = {2025}
}

Comments

26 pages