Related papers: Approximate Correlation Clustering Using Same-Clus…
Ashtiani et al. proposed a Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to make adaptive queries to a domain expert. The queries are of the kind "do two given points belong to the same optimal cluster?"…
We propose a framework for Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to interact with a domain expert, asking whether two given instances belong to the same cluster or not. We study the query and…
We consider inapproximability of the correlation clustering problem defined as follows: Given a graph $G = (V,E)$ where each edge is labeled either "+" (similar) or "-" (dissimilar), correlation clustering seeks to partition the vertices…
We continue the investigation of problems concerning correlation clustering or clustering with qualitative information, which is a clustering formulation that has been studied recently. The basic setup here is that we are given as input a…
Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full…
The Consensus Clustering problem has been introduced as an effective way to analyze the results of different microarray experiments. The problem consists of looking for a partition that best summarizes a set of input partitions (each…
Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of…
Correlation Clustering is a classic clustering objective arising in numerous machine learning and data mining applications. Given a graph $G=(V,E)$, the goal is to partition the vertex set into clusters so as to minimize the number of edges…
Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of…
Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set…
Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature…
We revisit the simultaneous approximation model for the correlation clustering problem introduced by Davies, Moseley, and Newman[DMN24]. The objective is to find a clustering that minimizes given norms of the disagreement vector over all…
\textit{Clustering problems} often arise in the fields like data mining, machine learning etc. to group a collection of objects into similar groups with respect to a similarity (or dissimilarity) measure. Among the clustering problems,…
We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of…
We present a new approach for solving (minimum disagreement) correlation clustering that results in sublinear algorithms with highly efficient time and space complexity for this problem. In particular, we obtain the following algorithms for…
Correlation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. An instance of the Correlation Clustering problem consists of a graph $G$ (not necessarily complete)…
In the graph clustering problem with a planted solution, the input is a graph on $n$ vertices partitioned into $k$ clusters, and the task is to infer the clusters from graph structure. A standard assumption is that clusters induce…
In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we…
The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is traditionally considered an unsupervised learning task. In recent years, the use of background knowledge to improve the cluster quality and promote…
In this paper, we study parallel algorithms for the correlation clustering problem, where every pair of two different entities is labeled with similar or dissimilar. The goal is to partition the entities into clusters to minimize the number…