Embedding Compression via Spherical Coordinates

Han Xiao

Machine Learning 2026-03-27 v4 Computer Vision and Pattern Recognition

Authors: Han Xiao

Abstract

We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves $1.5\times$ compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is bounded by float32 machine epsilon ($1.19 \times 10^{-7}$), making reconstructed values indistinguishable from originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.
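The concentration effect described in the abstract can be illustrated with a small standard-library sketch. This is not the jzip implementation: the angle convention (polar angles from `atan2` of the remaining norm against each coordinate, with one signed final angle), the helper names, and the use of random Gaussian directions in place of real embeddings are all assumptions made for illustration. It converts a unit vector to spherical angles, checks that the transform inverts in float64, and compares the float32 exponent fields of the angles with those of the raw coordinates.

```python
import math
import random
import struct
from collections import Counter


def random_unit_vector(dim, rng):
    """Stand-in for an embedding: a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def to_spherical_angles(x):
    """Return d-1 angles: the first d-2 lie in [0, pi], the last in (-pi, pi]."""
    d = len(x)
    # tail_sq[k] = x[k]^2 + ... + x[d-1]^2 (suffix sums of squares)
    tail_sq = [0.0] * (d + 1)
    for k in range(d - 1, -1, -1):
        tail_sq[k] = tail_sq[k + 1] + x[k] * x[k]
    angles = [math.atan2(math.sqrt(tail_sq[k + 1]), x[k]) for k in range(d - 2)]
    angles.append(math.atan2(x[d - 1], x[d - 2]))
    return angles


def from_spherical_angles(angles):
    """Inverse of to_spherical_angles for a unit-norm vector."""
    x = []
    r = 1.0  # running product of sines = norm of the remaining coordinates
    for theta in angles[:-1]:
        x.append(r * math.cos(theta))
        r *= math.sin(theta)
    x.append(r * math.cos(angles[-1]))
    x.append(r * math.sin(angles[-1]))
    return x


def float32_exponent(value):
    """Biased 8-bit exponent field of the value rounded to IEEE 754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return (bits >> 23) & 0xFF


rng = random.Random(0)
dim = 512
x = random_unit_vector(dim, rng)
angles = to_spherical_angles(x)

# Round trip in float64 to confirm the two transforms are inverses.
max_err = max(abs(a - b) for a, b in zip(x, from_spherical_angles(angles)))

polar = angles[:-1]  # the angles expected to concentrate near pi/2
angle_exponents = Counter(float32_exponent(a) for a in polar)
coord_exponents = Counter(float32_exponent(c) for c in x)
top_exp, top_count = angle_exponents.most_common(1)[0]
share = top_count / len(polar)

print("round-trip max error:", max_err)
print("most common angle exponent:", top_exp, "share:", round(share, 3))
print("distinct exponents (angles vs coordinates):",
      len(angle_exponents), len(coord_exponents))
```

Angles near $\pi/2 \approx 1.57$ fall in $[1, 2)$, where the biased float32 exponent is 127, so the exponent field of most angles carries almost no information, while the raw coordinates spread over several exponent values. The last few angles are less concentrated because few coordinates remain beneath them, which is why the sketch reports a share rather than assuming every angle behaves identically.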

Keywords

source coding, image compression, metric space

Cite

@article{arxiv.2602.00079,
  title  = {Embedding Compression via Spherical Coordinates},
  author = {Han Xiao},
  journal= {arXiv preprint arXiv:2602.00079},
  year   = {2026}
}

Comments

Accepted at ICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling (GRaM). 13 pages, 2 figures. Code: https://github.com/jina-ai/jzip
