Related papers: Maximizing Multi-Information
In this paper we propose a Bayesian, information theoretic approach to dimensionality reduction. The approach is formulated as a variational principle on mutual information, and seamlessly addresses the notions of sufficiency, relevance,…
Inferring and comparing complex, multivariable probability density functions is fundamental to problems in several fields, including probabilistic learning, network theory, and data analysis. Classification and prediction are the two faces…
We investigate the sets of joint probability distributions that maximize the average multi-information over a collection of margins. These functionals serve as proxies for maximizing the multi-information of a set of variables or the mutual…
This paper investigates maximizers of the information divergence from an exponential family $E$. It is shown that the $rI$-projection of a maximizer $P$ to $E$ is a convex combination of $P$ and a probability measure $P_-$ with disjoint…
We derive independence tests by thresholding dependence measures in a semiparametric context. Precisely, estimates of $\phi$-mutual informations, associated with $\phi$-divergences between a joint distribution and the product distribution…
Multivariate mutual information provides a conceptual framework for characterizing higher-order interactions in complex systems. Two well-known measures of multivariate information, total correlation and dual total correlation, admit a…
The Chernoff information between two probability measures is a statistical divergence measuring their deviation defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for…
We consider the parameter estimation problem of a probabilistic generative model prescribed using a natural exponential family of distributions. For this problem, the typical maximum likelihood estimator usually overfits under limited…
We study the worst-case probability that $Y$ outperforms a benchmark $X$ when the law of $Y$ lies in a Kullback-Leibler neighbourhood of the benchmark. The max-min problem over couplings admits a tractable dual (via optimal transport),…
This article studies exponential families $\mathcal{E}$ on finite sets such that the information divergence $D(P\|\mathcal{E})$ of an arbitrary probability distribution from $\mathcal{E}$ is bounded by some constant $D>0$. A particular…
Mutual Information (MI) is a fundamental measure of statistical dependence widely used in representation learning. While direct optimization of MI via its definition as a Kullback-Leibler divergence (KLD) is often intractable, many recent…
Welfare maximization in bilateral trade has been extensively studied in recent years. Previous literature obtained incentive-compatible approximation mechanisms only for the private values case. In this paper, we study welfare maximization…
Distributed learning of probabilistic models from multiple data repositories with minimum communication is increasingly important. We study a simple communication-efficient learning framework that first calculates the local maximum…
The integrated information theory is thought to be a key clue towards the theoretical understanding of consciousness. In this study, we propose a simple numerical model comprising a set of coupled double quantum dots, where the…
Mutual information $I(X;Y)$ is a useful definition in information theory to estimate how much information the random variable $Y$ holds about the random variable $X$. One way to define the mutual information is by comparing the joint…
Bayesian networks are one of the most widely used classes of probabilistic models for risk management and decision support because of their interpretability and flexibility in including heterogeneous pieces of information. In any applied…
For a parametric model of distributions, the closest distribution in the model to the true distribution located outside the model is considered. Measuring the closeness between two distributions with the Kullback-Leibler (K-L) divergence,…
We show that the predicted probability distributions for any $N$-parameter statistical model taking the form of an exponential family can be explicitly and analytically embedded isometrically in an $N{+}N$-dimensional Minkowski space. That…
We present theoretical properties of the log-concave maximum likelihood estimator of a density based on an independent and identically distributed sample in $\mathbb{R}^d$. Our study covers both the case where the true underlying density is…
Formalising the confrontation of opinions (models) to observations (data) is the task of Inferential Statistics. Information Theory provides us with a basic functional, the relative entropy (or Kullback-Leibler divergence), an asymmetrical…