Related papers: Local correlation clustering

Query-Efficient Correlation Clustering

Correlation clustering is arguably the most natural formulation of clustering. Given n objects and a pairwise similarity measure, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same…

Data Structures and Algorithms · Computer Science 2020-02-27 David García-Soriano , Konstantin Kutzkov , Francesco Bonchi , Charalampos Tsourakakis

Solving the Correlation Cluster LP in Sublinear Time

Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of…

Data Structures and Algorithms · Computer Science 2025-11-05 Nairen Cao , Vincent Cohen-Addad , Shi Li , Euiwoong Lee , David Rasmussen Lolck , Alantha Newman , Mikkel Thorup , Lukas Vogl , Shuyi Yan , Hanwen Zhang

Almost 3-Approximate Correlation Clustering in Constant Rounds

We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of…

Data Structures and Algorithms · Computer Science 2022-05-10 Soheil Behnezhad , Moses Charikar , Weiyun Ma , Li-Yang Tan

Near-Optimal Correlation Clustering with Privacy

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set…

Machine Learning · Computer Science 2022-03-04 Vincent Cohen-Addad , Chenglin Fan , Silvio Lattanzi , Slobodan Mitrović , Ashkan Norouzi-Fard , Nikos Parotsidis , Jakub Tarnawski

Correlation Clustering via Strong Triadic Closure Labeling: Fast Approximation Algorithms and Practical Lower Bounds

Correlation clustering is a widely studied framework for clustering based on pairwise similarity and dissimilarity scores, but its best approximation algorithms rely on impractical linear programming relaxations. We present faster…

Data Structures and Algorithms · Computer Science 2022-06-27 Nate Veldt

Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost

Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full…

Data Structures and Algorithms · Computer Science 2019-08-15 Barna Saha , Sanjay Subramanian

Combinatorial Correlation Clustering

Correlation Clustering is a classic clustering objective arising in numerous machine learning and data mining applications. Given a graph $G=(V,E)$, the goal is to partition the vertex set into clusters so as to minimize the number of edges…

Data Structures and Algorithms · Computer Science 2024-07-17 Vincent Cohen-Addad , David Rasmussen Lolck , Marcin Pilipczuk , Mikkel Thorup , Shuyi Yan , Hanwen Zhang

Correlation Clustering with Low-Rank Matrices

Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature…

Machine Learning · Computer Science 2017-03-20 Nate Veldt , Anthony Wirth , David F. Gleich

Local Search for Clustering in Almost-linear Time

We propose the first local search algorithm for Euclidean clustering that attains an $O(1)$-approximation in almost-linear time. Specifically, for Euclidean $k$-Means, our algorithm achieves an $O(c)$-approximation in $\tilde{O}(n^{1…

Data Structures and Algorithms · Computer Science 2025-04-07 Shaofeng H.-C. Jiang , Yaonan Jin , Jianing Lou , Pinyan Lu

On Soft Clustering For Correlation Estimators

Properly estimating correlations between objects at different spatial scales necessitates $\mathcal{O}(n^2)$ distance calculations. For this reason, most widely adopted packages for estimating correlations use clustering algorithms to…

Instrumentation and Methods for Astrophysics · Physics 2025-09-15 Edward Berman , Sneh Pandya , Jacqueline McCleary , Marko Shuntov , Caitlin Casey , Nicole Drakos , Andreas Faisst , Steven Gillman , Ghassem Gozaliasl , Natalie Hogg , Jeyhan Kartaltepe , Anton Koekemoer , Wilfried Mercier , Diana Scognamiglio , COSMOS-Web: The JWST Cosmic Origins Survey

Fast Combinatorial Algorithms for Min Max Correlation Clustering

We introduce fast algorithms for correlation clustering with respect to the Min Max objective that provide constant factor approximations on complete graphs. Our algorithms are the first purely combinatorial approximation algorithms for…

Data Structures and Algorithms · Computer Science 2023-01-31 Sami Davies , Benjamin Moseley , Heather Newman

Crowdsourced correlation clustering with relative distance comparisons

Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement…

Data Structures and Algorithms · Computer Science 2017-09-26 Antti Ukkonen

Generalizing Fair Clustering to Multiple Groups: Algorithms and Applications

Clustering is a fundamental task in machine learning and data analysis, but it frequently fails to provide fair representation for various marginalized communities defined by multiple protected attributes -- a shortcoming often caused by…

Machine Learning · Computer Science 2025-11-17 Diptarka Chakraborty , Kushagra Chatterjee , Debarati Das , Tien-Long Nguyen

Correlation Clustering in Data Streams

Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms…

Data Structures and Algorithms · Computer Science 2018-12-06 Kook Jin Ahn , Graham Cormode , Sudipto Guha , Andrew McGregor , Anthony Wirth

Correlation Clustering in Constant Many Parallel Rounds

Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of…

Data Structures and Algorithms · Computer Science 2021-06-17 Vincent Cohen-Addad , Silvio Lattanzi , Slobodan Mitrović , Ashkan Norouzi-Fard , Nikos Parotsidis , Jakub Tarnawski

Correlation Clustering and Biclustering with Locally Bounded Errors

We consider a generalized version of the correlation clustering problem, defined as follows. Given a complete graph $G$ whose edges are labeled with $+$ or $-$, we wish to partition the graph into clusters while trying to avoid errors: $+$…

Data Structures and Algorithms · Computer Science 2016-05-25 Gregory J. Puleo , Olgica Milenkovic

Breaking 3-Factor Approximation for Correlation Clustering in Polylogarithmic Rounds

In this paper, we study parallel algorithms for the correlation clustering problem, where every pair of two different entities is labeled with similar or dissimilar. The goal is to partition the entities into clusters to minimize the number…

Data Structures and Algorithms · Computer Science 2023-07-14 Nairen Cao , Shang-En Huang , Hsin-Hao Su

Correlation Clustering with Adaptive Similarity Queries

In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we…

Machine Learning · Computer Science 2020-01-15 Marco Bressan , Nicolò Cesa-Bianchi , Andrea Paudice , Fabio Vitale

Improved Combinatorial Approximations for Weighted Correlation Clustering

We present combinatorial approximation algorithms for the weighted correlation clustering problem. In this problem, we have a set of vertices and two weight values for each pair of vertices, denoting their difference and similarity. The…

Data Structures and Algorithms · Computer Science 2025-07-16 Mojtaba Ostovari , Alireza Zarei

Multilayer Correlation Clustering

We establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering to the multilayer setting. In this model, we are given a series of inputs of Correlation Clustering (called layers) over the common set $V$ of…

Data Structures and Algorithms · Computer Science 2026-05-20 Atsushi Miyauchi , Florian Adriaens , Francesco Bonchi , Nikolaj Tatti