Adaptive Clustering through Semidefinite Programming

Statistics Theory, 2017-05-19, v1

Abstract

We analyze the clustering problem through a flexible probabilistic model that aims to identify an optimal partition of the sample X_1, ..., X_n. We perform exact clustering with high probability using a convex semidefinite estimator that can be interpreted as a corrected, relaxed version of K-means. The estimator is analyzed in a non-asymptotic framework and shown to be optimal or near-optimal in recovering the partition. Furthermore, its performance is shown to be adaptive to the problem's effective dimension, as well as to K, the unknown number of groups in this partition. We illustrate the method's performance in comparison to other classical clustering algorithms with numerical experiments on simulated data.
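
For orientation, the well-known semidefinite relaxation of K-means (usually attributed to Peng and Wei) can be sketched as follows. This is the generic relaxation only, not necessarily the exact estimator studied in the paper. The notation is assumed here and does not come from the abstract: $\hat\Lambda$ is the Gram matrix of the sample, $B$ is the relaxed partition matrix, $\mathcal{C}_K$ is the feasible set, and $\hat\Gamma$ is a hypothetical diagonal correction term standing in for the "corrected" part.

```latex
% K-means over partitions G = (G_1, ..., G_K) is equivalent to maximizing
% <\hat\Lambda, B> over partition matrices B, where
%   B_{ab} = 1/|G_k| if a and b both belong to G_k, and B_{ab} = 0 otherwise.
% Relaxing the set of partition matrices to a convex set gives an SDP:
\[
  \hat B \in \operatorname*{arg\,max}_{B \in \mathcal{C}_K}
  \langle \hat\Lambda, B \rangle,
  \qquad
  \hat\Lambda = \bigl( \langle X_a, X_b \rangle \bigr)_{1 \le a, b \le n},
\]
\[
  \mathcal{C}_K = \Bigl\{ B \in \mathbb{R}^{n \times n} :
    B \succeq 0,\;
    B_{ab} \ge 0 \ \text{for all } a, b,\;
    B \mathbf{1} = \mathbf{1},\;
    \operatorname{tr}(B) = K \Bigr\}.
\]
% Assumption (one possible reading of "corrected"): replace \hat\Lambda by
% \hat\Lambda - \hat\Gamma, with \hat\Gamma a diagonal matrix, before solving.
\[
  \hat B_{\mathrm{corr}} \in \operatorname*{arg\,max}_{B \in \mathcal{C}_K}
  \langle \hat\Lambda - \hat\Gamma, B \rangle .
\]
```

Every true partition matrix satisfies the four constraints defining $\mathcal{C}_K$, so the relaxed problem is a convex outer approximation of K-means. Exact clustering corresponds to the maximizer coinciding with the partition matrix of the true groups.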

Cite

@article{arxiv.1705.06615,
  title  = {Adaptive Clustering through Semidefinite Programming},
  author = {Martin Royer},
  journal= {arXiv preprint arXiv:1705.06615},
  year   = {2017}
}