Stable Sparse Subspace Embedding for Dimensionality Reduction

Machine Learning 2020-02-10 v1

Abstract

Sparse random projection (RP) is a popular tool for dimensionality reduction that shows promising performance with low computational complexity. However, in existing sparse RP matrices, the positions of the non-zero entries are usually selected at random. Although these matrices use uniform sampling with replacement, the sampling variance is large, so in a projection matrix generated in a single trial the number of non-zeros is uneven across rows, and more information in the data may be lost after dimension reduction. To overcome this bottleneck, this paper builds a stable sparse subspace embedding matrix (S-SSE), based on random sampling without replacement, in which the non-zeros are uniformly distributed. The S-SSE is proved to be more stable than existing matrices and to preserve the Euclidean distances between points well after dimension reduction. Our empirical studies corroborate the theoretical findings and demonstrate that the approach achieves satisfactory performance.
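The contrast between the two sampling schemes can be illustrated with a minimal sketch. It is not the paper's exact S-SSE construction: it assumes a projection matrix with a single ±1 entry per column, and the function names, the NumPy-based layout, and the way the balanced row labels are generated are all choices made for this illustration only.

```python
import numpy as np


def sparse_embedding_with_replacement(k, d, rng):
    """k x d matrix with one +/-1 per column; each row index is drawn
    independently and uniformly (sampling with replacement)."""
    rows = rng.integers(0, k, size=d)
    signs = rng.choice([-1.0, 1.0], size=d)
    P = np.zeros((k, d))
    P[rows, np.arange(d)] = signs
    return P


def sparse_embedding_without_replacement(k, d, rng):
    """Same shape and sparsity, but the row indices are a random shuffle of a
    balanced pool (assumed construction), i.e. drawn without replacement,
    so the per-row non-zero counts differ by at most one."""
    rows = rng.permutation(np.arange(d) % k)
    signs = rng.choice([-1.0, 1.0], size=d)
    P = np.zeros((k, d))
    P[rows, np.arange(d)] = signs
    return P


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, d = 8, 64
    A = sparse_embedding_with_replacement(k, d, rng)
    B = sparse_embedding_without_replacement(k, d, rng)
    # Non-zeros per row: typically uneven for A, exactly d / k for B.
    print("with replacement:   ", np.count_nonzero(A, axis=1))
    print("without replacement:", np.count_nonzero(B, axis=1))
```

Under these assumptions both matrices have the same number of non-zeros overall; only the balanced variant guarantees that every row receives the same share of them when k divides d.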

Cite

@article{arxiv.2002.02844,
  title  = {Stable Sparse Subspace Embedding for Dimensionality Reduction},
  author = {Li Chen and Shuizheng Zhou and Jiajun Ma},
  journal= {arXiv preprint arXiv:2002.02844},
  year   = {2020}
}