
Linear Time Algorithm for Projective Clustering

Computational Geometry 2015-03-20 v2

Abstract

Projective clustering is a problem of both theoretical and practical importance and has received a great deal of attention in recent years. Given a set of points $P$ in $\mathbb{R}^{d}$, projective clustering seeks a set $\mathbb{F}$ of $k$ lower-dimensional $j$-flats such that the average distance (or squared distance) from the points in $P$ to their closest flats is minimized. Existing approaches to this problem are mainly based on adaptive/volume sampling or core-set techniques, which suffer from several limitations. In this paper, we present the first uniform random sampling based approach for this challenging problem and achieve linear time solutions for three cases: general projective clustering, regular projective clustering, and $L_{\tau}$ sense projective clustering. For the general projective clustering problem, we show that for any given small numbers $0 < \gamma, \epsilon < 1$, our approach first removes $\gamma|P|$ points as outliers and then determines $k$ $j$-flats to cluster the remaining points into $k$ clusters with an objective value no more than $(1+\epsilon)$ times the optimal value for all points. For regular projective clustering, we demonstrate that when the input points satisfy some reasonable assumption, our approach for the general case can be extended to yield a PTAS for all points. For $L_{\tau}$ sense projective clustering, we show that our techniques for both the general and regular cases can be naturally extended to the $L_{\tau}$ sense projective clustering problem for any $1 \le \tau < \infty$. Our results are based on several novel techniques, such as slab partition, $\Delta$-rotation, symmetric sampling, and recursive projection, and can be easily implemented for applications.
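To make the objective concrete, the following is a minimal Python sketch of how the cost of a candidate set of flats could be evaluated. It is not the paper's algorithm; it only evaluates the objective described above. The function names `dist_to_flat` and `projective_cost`, the representation of a flat as an origin point plus an orthonormal basis, and the reading of the $L_{\tau}$ objective as the average $\tau$-th power of the distance are all assumptions made for illustration.

```python
import math

def dist_to_flat(point, origin, basis):
    """Euclidean distance from `point` to the flat origin + span(basis).

    Assumption: `basis` is a list of orthonormal direction vectors
    (j vectors for a j-flat).
    """
    diff = [p - o for p, o in zip(point, origin)]
    residual = diff[:]
    # Remove the component of `diff` that lies inside the flat.
    for b in basis:
        coeff = sum(x * y for x, y in zip(diff, b))
        residual = [r - coeff * y for r, y in zip(residual, b)]
    return math.sqrt(sum(r * r for r in residual))

def projective_cost(points, flats, tau=1.0):
    """Average of (distance to the closest flat) ** tau over all points.

    Assumption: tau=1 gives the average distance, tau=2 the average
    squared distance. Each flat is an (origin, basis) pair.
    """
    total = 0.0
    for p in points:
        total += min(dist_to_flat(p, o, b) for o, b in flats) ** tau
    return total / len(points)

# Hypothetical example: two 1-flats (lines) in the plane, y = 0 and y = 3.
points = [(1.0, 1.0), (5.0, 2.5), (2.0, 0.0), (-1.0, 4.0)]
flats = [((0.0, 0.0), [(1.0, 0.0)]), ((0.0, 3.0), [(1.0, 0.0)])]
avg_dist = projective_cost(points, flats, tau=1.0)
avg_sq_dist = projective_cost(points, flats, tau=2.0)
```

Here each point is charged to whichever line is nearer, so the per-point distances are 1, 0.5, 0, and 1. Evaluating the cost this way takes time proportional to $|P| \cdot k \cdot j \cdot d$, which is why the hard part of the problem is finding good flats rather than scoring them.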


Cite

@article{arxiv.1204.6717,
  title  = {Linear Time Algorithm for Projective Clustering},
  author = {Hu Ding and Jinhui Xu},
  journal= {arXiv preprint arXiv:1204.6717},
  year   = {2015}
}

Comments

22 pages, 8 figures
