Related papers: Subspace Embeddings Under Nonlinear Transformation…

Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression

Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$,…

Data Structures and Algorithms · Computer Science 2013-03-22 Xiangrui Meng, Michael W. Mahoney

Terminal Embeddings in Sublinear Time

Recently (Elkin, Filtser, Neiman 2017) introduced the concept of a terminal embedding from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a set of designated terminals $T\subset X$. Such an embedding $f$ is said to have…

Data Structures and Algorithms · Computer Science 2024-08-07 Yeshwanth Cherapanamjeri, Jelani Nelson

Subspace Embeddings and $\ell_p$-Regression Using Exponential Random Variables

Oblivious low-distortion subspace embeddings are a crucial building block for numerical linear algebra problems. We show for any real $p, 1 \leq p < \infty$, given a matrix $M \in \mathbb{R}^{n \times d}$ with $n \gg d$, with constant…

Data Structures and Algorithms · Computer Science 2014-03-19 David P. Woodruff, Qin Zhang

Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity

An oblivious subspace embedding is a random $m\times n$ matrix $\Pi$ such that, for any $d$-dimensional subspace, with high probability $\Pi$ preserves the norms of all vectors in that subspace within a $1\pm\epsilon$ factor. In this work,…

Data Structures and Algorithms · Computer Science 2025-04-30 Shabarish Chenakkod, Michał Dereziński, Xiaoyu Dong

Sparse Dimensionality Reduction Revisited

The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m=O(\varepsilon^{-2} \lg n)$ dimensions while preserving all pairwise…

Data Structures and Algorithms · Computer Science 2023-02-14 Mikael Møller Høgsgaard, Lion Kamma, Kasper Green Larsen, Jelani Nelson, Chris Schwiegelshohn

On Probabilistic Embeddings in Optimal Dimension Reduction

Dimension reduction algorithms are a crucial part of many data science pipelines, including data exploration, feature creation and selection, and denoising. Despite their wide utilization, many non-linear dimension reduction algorithms are…

Machine Learning · Statistics 2024-08-06 Ryan Murray, Adam Pickarski

Small Width, Low Distortions: Quantized Random Embeddings of Low-complexity Sets

Under which conditions and with which distortions can we preserve the pairwise-distances of low-complexity vectors, e.g., for structured sets such as the set of sparse vectors or the one of low-rank matrices, when these are mapped in a…

Information Theory · Computer Science 2016-11-15 Laurent Jacques

Lower Memory Oblivious (Tensor) Subspace Embeddings with Fewer Random Bits: Modewise Methods for Least Squares

In this paper new general modewise Johnson-Lindenstrauss (JL) subspace embeddings are proposed that are both considerably faster to generate and easier to store than traditional JL embeddings when working with extremely large vectors and/or…

Numerical Analysis · Mathematics 2020-12-18 M. A. Iwen, D. Needell, E. Rebrova, A. Zare

Small Transformers Compute Universal Metric Embeddings

We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by…

Machine Learning · Computer Science 2023-10-17 Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

Fast Nearest Neighbor Preserving Embeddings

We show an analog to the Fast Johnson-Lindenstrauss Transform for Nearest Neighbor Preserving Embeddings in $\ell_2$. These are sparse, randomized embeddings that preserve the (approximate) nearest neighbors. The dimensionality of the…

Data Structures and Algorithms · Computer Science 2017-07-24 Johan Sivertsen

Johnson-Lindenstrauss embeddings for noisy vectors -- taking advantage of the noise

This paper investigates theoretical properties of subsampling and hashing as tools for approximate Euclidean norm-preserving embeddings for vectors with (unknown) additive Gaussian noises. Such embeddings are sometimes called…

Data Structures and Algorithms · Computer Science 2022-09-05 Zhen Shao

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation. This procedure conditions the…

Machine Learning · Computer Science 2022-10-04 Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

Optimal Embedding Dimension for Sparse Subspace Embeddings

A random $m\times n$ matrix $S$ is an oblivious subspace embedding (OSE) with parameters $\epsilon>0$, $\delta\in(0,1/3)$ and $d\leq m\leq n$, if for any $d$-dimensional subspace $W\subseteq \mathbb{R}^n$, $P\big(\,\forall_{x\in W}\…

Data Structures and Algorithms · Computer Science 2025-11-18 Shabarish Chenakkod, Michał Dereziński, Xiaoyu Dong, Mark Rudelson

On the Effect of Misspecifying the Embedding Dimension in Low-rank Network Models

As network data has become ubiquitous in the sciences, there has been growing interest in network models whose structure is driven by latent node-level variables in a (typically low-dimensional) latent geometric space. These "latent…

Statistics Theory · Mathematics 2026-01-12 Roddy Taing, Keith Levin

Subspace Representations for Soft Set Operations and Sentence Similarities

In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words. Yet, when it comes to the representations of sets of words, the conventional…

Computation and Language · Computer Science 2024-04-11 Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura

Lossless Prioritized Embeddings

Given metric spaces $(X,d)$ and $(Y,\rho)$ and an ordering $x_1,x_2,\ldots,x_n$ of $(X,d)$, an embedding $f: X \rightarrow Y$ is said to have a prioritized distortion $\alpha(\cdot)$, if for any pair $x_j,x'$ of distinct points in $X$, the…

Data Structures and Algorithms · Computer Science 2019-07-17 Michael Elkin, Ofer Neiman

A Process for the Evaluation of Node Embedding Methods in the Context of Node Classification

Node embedding methods find latent lower-dimensional representations which are used as features in machine learning models. In the last few years, these methods have become extremely popular as a replacement for manual feature engineering.…

Social and Information Networks · Computer Science 2020-06-01 Christoph Martin, Meike Riebeling

Lightweight Adaptation of Neural Language Models via Subspace Embedding

Traditional neural word embeddings are usually dependent on a richer diversity of vocabulary. However, the language models recline to cover major vocabularies via the word embedding parameters, in particular, for multilingual language…

Computation and Language · Computer Science 2023-08-21 Amit Kumar Jaiswal, Haiming Liu

Fast Fixed Dimension L2-Subspace Embeddings of Arbitrary Accuracy, With Application to L1 and L2 Tasks

We give a fast oblivious L2-embedding of $A\in \mathbb{R}^{n \times d}$ to $B\in \mathbb{R}^{r \times d}$ satisfying $(1-\varepsilon)\|A x\|_2^2 \le \|B x\|_2^2 \le (1+\varepsilon) \|Ax\|_2^2.$ Our embedding dimension $r$ equals $d$, a constant…

Machine Learning · Computer Science 2019-09-30 Malik Magdon-Ismail, Alex Gittens

On the Dimensionality of Embeddings for Sparse Features and Data

In this note we discuss a common misconception, namely that embeddings are always used to reduce the dimensionality of the item space. We show that when we measure dimensionality in terms of information entropy then the embedding of sparse…

Machine Learning · Computer Science 2019-01-09 Maxim Naumov