Related papers: Spread Divergence
Real-life data are often non-IID due to complex distributions and interactions, and the sensitivity to the distribution of samples can differ among learning models. Accordingly, a key question for any supervised or unsupervised model is…
Finite precision approximations of discrete probability distributions are considered, applicable for distribution synthesis, e.g., probabilistic shaping. Two algorithms are presented that find the optimal $M$-type approximation $Q$ of a…
The purpose of unconditional text generation is to train a model with real sentences, then generate novel sentences of the same quality and diversity as the training data. However, when different metrics are used for comparing the methods…
Diffusion models are one of the most important families of deep generative models. In this note, we derive a quantitative upper bound on the Wasserstein distance between the data-generating distribution and the distribution learned by a…
This work presents an upper-bound to value that the Kullback-Leibler (KL) divergence can reach for a class of probability distributions called quantum distributions (QD). The aim is to find a distribution $U$ which maximizes the KL…
We introduce a new discrepancy score between two distributions that gives an indication on their similarity. While much research has been done to determine if two samples come from exactly the same distribution, much less research…
Detecting weak, systematic distribution shifts and quantitatively modeling individual, heterogeneous responses to policies or incentives have found increasing empirical applications in social and economic sciences. Given two probability…
Targeting to understand the underlying explainable factors behind observations and modeling the conditional generation process on these factors, we connect disentangled representation learning to Diffusion Probabilistic Models (DPMs) to…
Discriminative linear models are a popular tool in machine learning. These can be generally divided into two types: The first is linear classifiers, such as support vector machines, which are well studied and provide state-of-the-art…
We study the question of testing structured properties (classes) of discrete distributions. Specifically, given sample access to an arbitrary distribution $D$ over $[n]$ and a property $\mathcal{P}$, the goal is to distinguish between…
We provide a distribution-free test that can be used to determine whether any two joint distributions $p$ and $q$ are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we…
In this short note, we study the distribution of spreads in a point set $\mathcal{P} \subseteq \mathbb{F}_q^d$, which are analogous to angles in Euclidean space. More precisely, we prove that, for any $\varepsilon > 0$, if $|\mathcal{P}|…
Many machine learning models appear to deploy effortlessly under distribution shift, and perform well on a target distribution that is considerably different from the training distribution. Yet, learning theory of distribution shift bounds…
`Distribution regression' refers to the situation where a response Y depends on a covariate P where P is a probability distribution. The model is Y=f(P) + mu where f is an unknown regression function and mu is a random error. Typically, we…
Influence overlap is a universal phenomenon in influence spreading for social networks. In this paper, we argue that the redundant influence generated by influence overlap cause negative effect for maximizing spreading influence. Firstly,…
We derive an (almost) guaranteed upper bound on the error of deep neural networks under distribution shift using unlabeled test data. Prior methods either give bounds that are vacuous in practice or give estimates that are accurate on…
Out-of-distribution (OOD) detection is an important task in machine learning systems for ensuring their reliability and safety. Deep probabilistic generative models facilitate OOD detection by estimating the likelihood of a data sample.…
Out-of-distribution (OOD) detection is crucial for deploying robust machine learning models, especially in areas where security is critical. However, traditional OOD detection methods often fail to capture complex data distributions from…
Introducing angular dispersion into a pulsed field associates each frequency with a particular angle with respect to the propagation axis. A perennial yet implicit assumption is that the propagation angle is differentiable with respect to…
Out-of-distribution (OOD) detection is critical for ensuring the reliability of deep learning systems, particularly in safety-critical applications. Likelihood-based deep generative models have historically faced criticism for their…