Related papers: Approximate Correlation Clustering Using Same-Clus…

Approximate Clustering with Same-Cluster Queries

Ashtiani et al. proposed a Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to make adaptive queries to a domain expert. The queries are of the kind "do two given points belong to the same optimal cluster?"…

Data Structures and Algorithms · Computer Science 2017-10-05 Nir Ailon, Anup Bhattacharya, Ragesh Jaiswal, Amit Kumar

Clustering with Same-Cluster Queries

We propose a Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to interact with a domain expert, asking whether two given instances belong to the same cluster or not. We study the query and…

Machine Learning · Computer Science 2016-11-23 Hassan Ashtiani, Shrinu Kushagra, Shai Ben-David

A Note on the Inapproximability of Correlation Clustering

We consider inapproximability of the correlation clustering problem defined as follows: Given a graph $G = (V,E)$ where each edge is labeled either "+" (similar) or "-" (dissimilar), correlation clustering seeks to partition the vertices…

Machine Learning · Computer Science 2009-03-23 Jinsong Tan

Correlation Clustering with a Fixed Number of Clusters

We continue the investigation of problems concerning correlation clustering or clustering with qualitative information, which is a clustering formulation that has been studied recently. The basic setup here is that we are given as input a…

Data Structures and Algorithms · Computer Science 2007-05-23 Ioannis Giotis, Venkatesan Guruswami

Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost

Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full…

Data Structures and Algorithms · Computer Science 2019-08-15 Barna Saha, Sanjay Subramanian

A PTAS for the Minimum Consensus Clustering Problem with a Fixed Number of Clusters

The Consensus Clustering problem has been introduced as an effective way to analyze the results of different microarray experiments. The problem consists of looking for a partition that best summarizes a set of input partitions (each…

Data Structures and Algorithms · Computer Science 2009-07-13 Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi

Correlation Clustering in Constant Many Parallel Rounds

Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of…

Data Structures and Algorithms · Computer Science 2021-06-17 Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Combinatorial Correlation Clustering

Correlation Clustering is a classic clustering objective arising in numerous machine learning and data mining applications. Given a graph $G=(V,E)$, the goal is to partition the vertex set into clusters so as to minimize the number of edges…

Data Structures and Algorithms · Computer Science 2024-07-17 Vincent Cohen-Addad, David Rasmussen Lolck, Marcin Pilipczuk, Mikkel Thorup, Shuyi Yan, Hanwen Zhang

Solving the Correlation Cluster LP in Sublinear Time

Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of…

Data Structures and Algorithms · Computer Science 2025-11-05 Nairen Cao, Vincent Cohen-Addad, Shi Li, Euiwoong Lee, David Rasmussen Lolck, Alantha Newman, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang

Near-Optimal Correlation Clustering with Privacy

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set…

Machine Learning · Computer Science 2022-03-04 Vincent Cohen-Addad, Chenglin Fan, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Correlation Clustering with Low-Rank Matrices

Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature…

Machine Learning · Computer Science 2017-03-20 Nate Veldt, Anthony Wirth, David F. Gleich

Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering

We revisit the simultaneous approximation model for the correlation clustering problem introduced by Davies, Moseley, and Newman [DMN24]. The objective is to find a clustering that minimizes given norms of the disagreement vector over all…

Data Structures and Algorithms · Computer Science 2024-10-23 Nairen Cao, Shi Li, Jia Ye

On Variants of k-means Clustering

Clustering problems often arise in fields such as data mining and machine learning, where a collection of objects must be grouped into similar groups with respect to a similarity (or dissimilarity) measure. Among the clustering problems,…

Computational Geometry · Computer Science 2015-12-10 Sayan Bandyapadhyay, Kasturi Varadarajan

Almost 3-Approximate Correlation Clustering in Constant Rounds

We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of…

Data Structures and Algorithms · Computer Science 2022-05-10 Soheil Behnezhad, Moses Charikar, Weiyun Ma, Li-Yang Tan

Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions

We present a new approach for solving (minimum disagreement) correlation clustering that results in sublinear algorithms with highly efficient time and space complexity for this problem. In particular, we obtain the following algorithms for…

Data Structures and Algorithms · Computer Science 2021-09-30 Sepehr Assadi, Chen Wang

Improved algorithms for Correlation Clustering with local objectives

Correlation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. An instance of the Correlation Clustering problem consists of a graph $G$ (not necessarily complete)…

Data Structures and Algorithms · Computer Science 2019-06-25 Sanchit Kalhan, Konstantin Makarychev, Timothy Zhou

Spectral Clustering with Side Information

In the graph clustering problem with a planted solution, the input is a graph on $n$ vertices partitioned into $k$ clusters, and the task is to infer the clusters from graph structure. A standard assumption is that clusters induce…

Data Structures and Algorithms · Computer Science 2025-11-24 Hendrik Fichtenberger, Michael Kapralov, Ekaterina Kochetkova, Silvio Lattanzi, Davide Mazzali, Weronika Wrzos-Kaminska

Correlation Clustering with Adaptive Similarity Queries

In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we…

Machine Learning · Computer Science 2020-01-15 Marco Bressan, Nicolò Cesa-Bianchi, Andrea Paudice, Fabio Vitale

An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering

The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is traditionally considered an unsupervised learning task. In recent years, the use of background knowledge to improve the cluster quality and promote…

Optimization and Control · Mathematics 2022-07-26 Veronica Piccialli, Anna Russo Russo, Antonio M. Sudoso

Breaking 3-Factor Approximation for Correlation Clustering in Polylogarithmic Rounds

In this paper, we study parallel algorithms for the correlation clustering problem, where every pair of distinct entities is labeled as similar or dissimilar. The goal is to partition the entities into clusters to minimize the number…

Data Structures and Algorithms · Computer Science 2023-07-14 Nairen Cao, Shang-En Huang, Hsin-Hao Su