Binary Embedding: Fundamental Limits and Fast Algorithm
Abstract
Binary embedding is a nonlinear dimension reduction methodology where high dimensional data are embedded into the Hamming cube while preserving the structure of the original space. Specifically, for an arbitrary set of $N$ distinct points in $\mathbb{S}^{p-1}$, our goal is to encode each point using $m$-dimensional binary strings such that we can reconstruct their geodesic distance up to $\delta$ uniform distortion. Existing binary embedding algorithms either lack theoretical guarantees or suffer from running time $O(mp)$. We make three contributions: (1) we establish a lower bound that shows any binary embedding oblivious to the set of points requires $m = \Omega(\frac{1}{\delta^2}\log N)$ bits and a similar lower bound for non-oblivious embeddings into Hamming distance; (2) [DELETED, see comment]; (3) we also provide an analytic result about embedding a general set of points $K \subseteq \mathbb{S}^{p-1}$, which may even be infinite. Our theoretical findings are supported through experiments on both synthetic and real data sets.
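To make the setting concrete, the following is a minimal sketch of the classical dense baseline, sign random projection, and not the algorithm of this paper. It assumes unit-norm inputs and a dense Gaussian matrix; the names `binary_embed`, `normalized_hamming`, and `geodesic_distance`, the code length `m`, and the ambient dimension `p` are illustrative choices made here. For such codes the normalized Hamming distance between two points concentrates around their angle divided by pi.

```python
import numpy as np


def binary_embed(points, m, rng):
    """Encode each row of `points` (unit vectors in R^p) as an m-bit code sign(Ax)."""
    p = points.shape[1]
    # Dense Gaussian projection: the matrix-vector product costs O(m * p) per point.
    A = rng.standard_normal((m, p))
    return (points @ A.T) >= 0


def normalized_hamming(b1, b2):
    """Fraction of coordinates in which two binary codes differ."""
    return float(np.mean(b1 != b2))


def geodesic_distance(x, y):
    """Angle between two unit vectors, rescaled to [0, 1] by dividing by pi."""
    cos = np.clip(np.dot(x, y), -1.0, 1.0)
    return float(np.arccos(cos) / np.pi)


# Illustrative data: 5 random points on the unit sphere in R^64 (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 64))
X /= np.linalg.norm(X, axis=1, keepdims=True)

codes = binary_embed(X, m=4096, rng=rng)

# Largest gap between code distance and true geodesic distance over all pairs.
max_err = max(
    abs(normalized_hamming(codes[i], codes[j]) - geodesic_distance(X[i], X[j]))
    for i in range(len(X))
    for j in range(i + 1, len(X))
)
print(f"max distortion over all pairs: {max_err:.4f}")
```

Increasing `m` in this sketch tightens the distortion, at the price of longer codes and a proportionally more expensive dense projection.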
Cite
@article{arxiv.1502.05746,
title = {Binary Embedding: Fundamental Limits and Fast Algorithm},
author = {Xinyang Yi and Constantine Caramanis and Eric Price},
journal = {arXiv preprint arXiv:1502.05746},
year = {2019}
}
Comments
Note: the previous version of this paper also included a claimed fast upper bound for certain parameter regimes. The proof of this had an error, as pointed out in Dirksen and Stollenwerk (2018); the same paper also presents a correct algorithm for the setting