Embedding Compression via Spherical Coordinates
Abstract
We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves 1.5$\times$ compression, 25% better than the best prior lossless method. The method exploits the fact that the spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, which causes their IEEE 754 exponents to collapse to a single value and their high-order mantissa bits to become predictable, so both can be entropy coded. Reconstruction error is bounded by float32 machine epsilon ($\approx 1.19 \times 10^{-7}$), making reconstructed values indistinguishable from the originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.
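The concentration effect described in the abstract can be checked numerically. The following is a minimal sketch, not the paper's codec: it assumes the standard hyperspherical angle convention, uses a random Gaussian direction as a stand-in for a real embedding, and the helper names `to_spherical_angles` and `float32_exponents` are hypothetical. It only illustrates that the leading angles cluster near $\pi/2$ and therefore share one float32 exponent, while the raw Cartesian coordinates do not.

```python
import numpy as np

def to_spherical_angles(x):
    """Return the d-1 hyperspherical angles of a vector x in R^d.

    Assumed convention: theta_i = arccos(x_i / ||x_i..x_d||) for the first
    d-2 angles, and the last angle is atan2(x_d, x_{d-1}).
    """
    x = np.asarray(x, dtype=np.float64)
    # tail_norm[i] = sqrt(x[i]^2 + ... + x[d-1]^2)
    tail_norm = np.sqrt(np.cumsum(x[::-1] ** 2)[::-1])
    angles = np.arccos(np.clip(x[:-2] / tail_norm[:-2], -1.0, 1.0))
    last = np.arctan2(x[-1], x[-2])
    return np.append(angles, last)

def float32_exponents(values):
    """Extract the 8-bit biased IEEE 754 exponent of each value as float32."""
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    return (bits >> 23) & 0xFF

# Illustrative input (an assumption): a random unit vector, not a real embedding.
rng = np.random.default_rng(0)
d = 1024
x = rng.standard_normal(d)
x /= np.linalg.norm(x)

theta = to_spherical_angles(x)
# Concentration is strongest while many dimensions remain, so look at the
# first half of the angles only.
head = theta[: d // 2]

print("mean of leading angles:", head.mean(), "vs pi/2 =", np.pi / 2)
print("distinct exponents, leading angles:", np.unique(float32_exponents(head)))
print("distinct exponents, raw coordinates:", np.unique(float32_exponents(x)))
```

Any float32 value in $[1, 2)$ has biased exponent 127, which is why angles clustered near $\pi/2 \approx 1.57$ share a single exponent. The trailing angles, where few dimensions remain, concentrate less tightly, and the final angle ranges over $(-\pi, \pi]$, so this sketch says nothing about how an actual codec should treat them.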
Cite
@article{arxiv.2602.00079,
title = {Embedding Compression via Spherical Coordinates},
author = {Han Xiao},
journal= {arXiv preprint arXiv:2602.00079},
year = {2026}
}
Comments
Accepted at ICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling (GRaM). 13 pages, 2 figures. Code: https://github.com/jina-ai/jzip