Related papers: Local correlation clustering
Correlation clustering is arguably the most natural formulation of clustering. Given n objects and a pairwise similarity measure, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same…
Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of…
We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of…
Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set…
Correlation clustering is a widely studied framework for clustering based on pairwise similarity and dissimilarity scores, but its best approximation algorithms rely on impractical linear programming relaxations. We present faster…
Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full…
Correlation Clustering is a classic clustering objective arising in numerous machine learning and data mining applications. Given a graph $G=(V,E)$, the goal is to partition the vertex set into clusters so as to minimize the number of edges…
Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature…
We propose the first \emph{local search} algorithm for Euclidean clustering that attains an $O(1)$-approximation in almost-linear time. Specifically, for Euclidean $k$-Means, our algorithm achieves an $O(c)$-approximation in $\tilde{O}(n^{1…
Properly estimating correlations between objects at different spatial scales necessitates $\mathcal{O}(n^2)$ distance calculations. For this reason, most widely adopted packages for estimating correlations use clustering algorithms to…
We introduce fast algorithms for correlation clustering with respect to the Min Max objective that provide constant factor approximations on complete graphs. Our algorithms are the first purely combinatorial approximation algorithms for…
Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement…
Clustering is a fundamental task in machine learning and data analysis, but it frequently fails to provide fair representation for various marginalized communities defined by multiple protected attributes -- a shortcoming often caused by…
Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms…
Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of…
We consider a generalized version of the correlation clustering problem, defined as follows. Given a complete graph $G$ whose edges are labeled with $+$ or $-$, we wish to partition the graph into clusters while trying to avoid errors: $+$…
In this paper, we study parallel algorithms for the correlation clustering problem, where every pair of two different entities is labeled with similar or dissimilar. The goal is to partition the entities into clusters to minimize the number…
In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we…
We present combinatorial approximation algorithms for the weighted correlation clustering problem. In this problem, we have a set of vertices and two weight values for each pair of vertices, denoting their difference and similarity. The…
We establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering to the multilayer setting. In this model, we are given a series of inputs of Correlation Clustering (called layers) over the common set $V$ of…