Correlation Clustering Generalized

Computational Complexity 2018-09-26 v1 Data Structures and Algorithms

Abstract

We present new results for LambdaCC and MotifCC, two recently introduced variants of the well-studied correlation clustering problem. Both variants are motivated by applications to network analysis and community detection, and have non-trivial approximation algorithms. We first show that the standard linear programming relaxation of LambdaCC has a $\Theta(\log n)$ integrality gap for a certain choice of the parameter $\lambda$. This sheds light on previous challenges encountered in obtaining parameter-independent approximation results for LambdaCC. We generalize a previous constant-factor algorithm to provide the best results, from the LP-rounding approach, for an extended range of $\lambda$. MotifCC generalizes correlation clustering to the hypergraph setting. In the case of hyperedges of degree 3 with weights satisfying probability constraints, we improve the best approximation factor from 9 to 8. We show that in general our algorithm gives a $4(k-1)$ approximation when hyperedges have maximum degree $k$ and probability weights. We additionally present approximation results for LambdaCC and MotifCC where we restrict to forming only two clusters.
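
As a quick consistency check on the MotifCC bounds stated above, substituting the degree-3 case into the general factor reproduces the improved constant. This is only arithmetic on the figures given in the abstract, not a derivation of the bound itself:

```latex
\[
  4(k-1)\,\Big|_{k=3} \;=\; 4 \cdot 2 \;=\; 8 \;<\; 9 .
\]
```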

Cite

@article{arxiv.1809.09493,
  title  = {Correlation Clustering Generalized},
  author = {David F. Gleich and Nate Veldt and Anthony Wirth},
  journal= {arXiv preprint arXiv:1809.09493},
  year   = {2018}
}