Universal Regular Conditional Distributions

Machine Learning 2023-02-24 v5 Neural and Evolutionary Computing Metric Geometry Probability Machine Learning

Abstract

We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map; second, a deep feedforward neural network processes these linearized features; third, the network's outputs are transformed to the $1$-Wasserstein space $\mathcal{P}_1(\mathbb{R}^D)$ via a probabilistic extension of the attention mechanism of Bahdanau et al. (2014). Our model, called the probabilistic transformer (PT), can approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compact sets, quantitatively. We identify two ways in which the PT avoids the curse of dimensionality when approximating $\mathcal{P}_1(\mathbb{R}^D)$-valued functions. The first strategy builds functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ that can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$. In the second approach, given any function $f$ in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$, we build compact subsets of $\mathbb{R}^d$ on which $f$ can be efficiently approximated by a PT.
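The three phases described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it assumes the input space is already $\mathbb{R}^d$ (so the feature map is the identity), uses a single hidden layer with untrained random weights, and assumes the output measure is a finite mixture of Dirac masses at fixed locations, weighted by a softmax. All names (`feature_map`, `feedforward`, `probabilistic_transformer`, `atoms`) and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): input dim d, hidden width,
# number of atoms N, output dim D.
d, hidden, N, D = 3, 16, 5, 2

# Hypothetical parameters: randomly initialised, not trained.
W1, b1 = rng.normal(size=(hidden, d)), np.zeros(hidden)
W2, b2 = rng.normal(size=(N, hidden)), np.zeros(N)
atoms = rng.normal(size=(N, D))  # support points y_1, ..., y_N in R^D


def feature_map(x):
    """Phase 1: placeholder feature map (identity, assuming X = R^d)."""
    return np.asarray(x, dtype=float)


def feedforward(z):
    """Phase 2: one-hidden-layer ReLU network from R^d to R^N."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2


def softmax(v):
    """Numerically stable softmax: nonnegative weights summing to one."""
    e = np.exp(v - v.max())
    return e / e.sum()


def probabilistic_transformer(x):
    """Phase 3: softmax weights over fixed atoms.

    Returns (weights, atoms), representing the finitely supported
    probability measure sum_n weights[n] * delta_{atoms[n]} on R^D.
    """
    weights = softmax(feedforward(feature_map(x)))
    return weights, atoms


weights, support = probabilistic_transformer([0.1, -0.4, 2.0])
mean = weights @ support  # mean of the output measure, a point in R^D
```

Because the output has finite support, it automatically has a finite first moment and therefore lies in $\mathcal{P}_1(\mathbb{R}^D)$; the input only changes how the mass is distributed across the atoms.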

Keywords

Cite

@article{arxiv.2105.07743,
  title  = {Universal Regular Conditional Distributions},
  author = {Anastasis Kratsios},
  journal = {arXiv preprint arXiv:2105.07743},
  year   = {2023}
}

Comments

Regular Conditional Distributions, Geometric Deep Learning, Computational Optimal Transport, Measure-Valued Neural Networks, Universal Approximation, Transformers