Sparse Dimensionality Reduction Revisited
Abstract
The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m = O(\varepsilon^{-2} \lg n)$ dimensions while preserving all pairwise distances to within $1 \pm \varepsilon$. Each input point $x$ is embedded to $Ax$, where $A$ is an $m \times d$ matrix having $s$ non-zeros per column, allowing for an embedding time of $O(s \|x\|_0)$. Since the sparsity $s$ of $A$ governs the embedding time, much work has gone into reducing it. The current state of the art, by Kane and Nelson (JACM'14), shows that $s = O(\varepsilon^{-1} \lg n)$ suffices. This is almost matched by a lower bound of $s = \Omega(\varepsilon^{-1} \lg n / \lg(1/\varepsilon))$ by Nelson and Nguyen (STOC'13). Previous work thus suggests that we have near-optimal embeddings. In this work, we revisit sparse embeddings and identify a loophole in the lower bound. Concretely, it requires $d \geq n$, which in many applications is unrealistic. We exploit this loophole to give a sparser embedding when $d = o(n)$, achieving $s = O(\varepsilon^{-1}(\lg n / \lg(1/\varepsilon) + \lg^{2/3} n \lg^{1/3} d))$. We also complement our analysis by strengthening the lower bound of Nelson and Nguyen to hold also when $d \ll n$, thereby matching the first term in our new sparsity upper bound. Finally, we also improve the sparsity of the best oblivious subspace embeddings for optimal embedding dimensionality.
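To make the role of the sparsity concrete, here is a minimal Python sketch of why an embedding matrix with $s$ non-zeros per column gives an embedding time proportional to $s \|x\|_0$. It assumes one common sparse construction (each column gets $s$ distinct random rows with entries $\pm 1/\sqrt{s}$), which is not necessarily the exact construction analyzed in the paper. The function names `sample_sparse_jl` and `embed` and the parameter values are hypothetical and chosen only for illustration; they are not derived from the bounds above.

```python
import math
import random


def sample_sparse_jl(m, d, s, seed=0):
    """Sample an m x d matrix with exactly s non-zeros per column.

    Column j is stored as s distinct row indices plus s random signs;
    every non-zero entry has magnitude 1/sqrt(s). (Assumed construction.)
    """
    rng = random.Random(seed)
    rows = [rng.sample(range(m), s) for _ in range(d)]
    signs = [[rng.choice((-1.0, 1.0)) for _ in range(s)] for _ in range(d)]
    return rows, signs


def embed(rows, signs, x_sparse, m):
    """Compute Ax for x given as {coordinate: value} holding only non-zeros.

    Each non-zero coordinate of x triggers exactly s updates, so the loop
    performs s * ||x||_0 additions (plus O(m) to allocate the output).
    """
    s = len(rows[0])
    scale = 1.0 / math.sqrt(s)
    y = [0.0] * m
    for j, value in x_sparse.items():
        for r, sign in zip(rows[j], signs[j]):
            y[r] += sign * value * scale
    return y


# Illustrative parameters only.
rows, signs = sample_sparse_jl(m=64, d=1000, s=8)
y = embed(rows, signs, {3: 2.0, 17: -1.5}, m=64)
```

Storing only the row indices and signs of each column, rather than a dense $m \times d$ array, is what keeps the work tied to $s$ and the number of non-zeros of $x$ instead of to $m$ or $d$.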
Cite
@article{arxiv.2302.06165,
title = {Sparse Dimensionality Reduction Revisited},
author = {Mikael Møller Høgsgaard and Lion Kamma and Kasper Green Larsen and Jelani Nelson and Chris Schwiegelshohn},
journal = {arXiv preprint arXiv:2302.06165},
year = {2023}
}