The Gaussian Transform

Machine Learning, 2020-06-23, v1

Abstract

We introduce the Gaussian transform (GT), an optimal-transport-inspired iterative method for denoising and enhancing latent structures in datasets. Under the hood, GT generates a new distance function (the GT distance) on a given dataset by computing the $\ell^2$-Wasserstein distance between certain Gaussian density estimates obtained by localizing the dataset to individual points. Our contribution is twofold: (1) theoretically, we establish that GT is stable under perturbations and that, in the continuous case, each point possesses an asymptotically ellipsoidal neighborhood with respect to the GT distance; (2) computationally, we accelerate GT both by identifying a strategy for reducing the number of matrix square root computations inherent to the $\ell^2$-Wasserstein distance between Gaussian measures, and by avoiding redundant computations of GT distances between points via enhanced neighborhood mechanisms. We also observe that GT is both a generalization and a strengthening of the mean shift (MS) method, and that it is a computationally efficient specialization of the recently proposed Wasserstein Transform (WT) method. We perform extensive experiments comparing the performance of GT, MS, and WT in different scenarios.
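To make the quantity named in the abstract concrete, here is a minimal sketch of the $\ell^2$-Wasserstein distance between two Gaussian measures. It uses the standard closed form $W_2^2 = \|m_1 - m_2\|^2 + \mathrm{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2})^{1/2}\big)$. The function name `gaussian_w2` and the restriction to two dimensions are assumptions made for illustration. In 2-D the trace of the matrix square root reduces to scalar operations, which keeps the sketch free of external libraries. This is not the acceleration strategy of the paper, which the abstract does not spell out.

```python
import math


def gaussian_w2(mean1, cov1, mean2, cov2):
    """l2-Wasserstein distance between the 2-D Gaussians N(mean1, cov1) and N(mean2, cov2).

    Each covariance is a symmetric positive semidefinite 2x2 matrix
    given as ((a, b), (b, c)). Hypothetical helper, for illustration only.
    """
    (a1, b1), (_, c1) = cov1
    (a2, b2), (_, c2) = cov2

    # Squared Euclidean distance between the means.
    mean_term = (mean1[0] - mean2[0]) ** 2 + (mean1[1] - mean2[1]) ** 2

    # tr(cov1) + tr(cov2)
    trace_sum = a1 + c1 + a2 + c2

    # tr(cov1 @ cov2) for symmetric 2x2 matrices.
    trace_prod = a1 * a2 + 2.0 * b1 * b2 + c1 * c2

    # det(cov1) * det(cov2)
    det_prod = (a1 * c1 - b1 * b1) * (a2 * c2 - b2 * b2)

    # For a PSD 2x2 matrix M with eigenvalues x, y:
    #   tr(M^{1/2}) = sqrt(x) + sqrt(y) = sqrt(tr(M) + 2 * sqrt(det(M))).
    # M = cov1^{1/2} cov2 cov1^{1/2} has the same eigenvalues as cov1 @ cov2,
    # so tr(M) = tr(cov1 @ cov2) and det(M) = det(cov1) * det(cov2).
    cross = math.sqrt(max(trace_prod + 2.0 * math.sqrt(max(det_prod, 0.0)), 0.0))

    # Clamp at zero to absorb floating-point round-off.
    w2_squared = mean_term + trace_sum - 2.0 * cross
    return math.sqrt(max(w2_squared, 0.0))


identity = ((1.0, 0.0), (0.0, 1.0))
# Equal covariances: the distance reduces to the distance between the means.
print(gaussian_w2((0.0, 0.0), identity, (3.0, 4.0), identity))  # 5.0
```

In higher dimensions the same formula needs a genuine matrix square root (for example via an eigendecomposition), which is the cost the abstract refers to when it mentions reducing the number of matrix square root computations.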

Cite

@article{arxiv.2006.11698,
  title  = {The Gaussian Transform},
  author = {Kun Jin and Facundo Mémoli and Zhengchao Wan},
  journal= {arXiv preprint arXiv:2006.11698},
  year   = {2020}
}