Related papers: Generalizing Bottleneck Problems
Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems which have found applications in machine learning, design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), strong data…
We study two dual settings of information processing. Let $ \mathsf{Y} \rightarrow \mathsf{X} \rightarrow \mathsf{W} $ be a Markov chain with fixed joint probability mass function $ \mathsf{P}_{\mathsf{X}\mathsf{Y}} $ and a mutual…
Suppose $X$ is a uniformly distributed $n$-dimensional binary vector and $Y$ is obtained by passing $X$ through a binary symmetric channel with crossover probability $\alpha$. A recent conjecture by Courtade and Kumar postulates that…
Combining the Information Bottleneck model with deep learning by replacing mutual information terms with deep neural nets has proved successful in areas ranging from generative modelling to interpreting deep neural networks. In this paper,…
We prove rigorously a source coding theorem that can probably be considered folklore, a generalization to arbitrary alphabets of a problem motivated by the Information Bottleneck method. For general random variables $(Y, X)$, we show…
Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from…
We view the Information Bottleneck Principle (IBP: Tishby et al., 1999; Schwartz-Ziv and Tishby, 2017) and Predictive Information Bottleneck Principle (PIBP: Still et al., 2007; Alemi, 2019) as special cases of a family of general…
Previous work has shown that DNNs with large depth $L$ and $L_{2}$-regularization are biased towards learning low-dimensional representations of the inputs, which can be interpreted as minimizing a notion of rank $R^{(0)}(f)$ of the learned…
The families of $f$-divergences (e.g. the Kullback-Leibler divergence) and Integral Probability Metrics (e.g. total variation distance or maximum mean discrepancies) are widely used to quantify the similarity between probability…
We formulate and analyze the compound information bottleneck programming. In this problem, a Markov chain $ \mathsf{X} \rightarrow \mathsf{Y} \rightarrow \mathsf{Z} $ is assumed with fixed marginal distributions $\mathsf{P}_{\mathsf{X}}$…
In classic papers, Zellner demonstrated that Bayesian inference could be derived as the solution to an information theoretic functional. Below we derive a generalized form of this functional as a variational lower bound of a predictive…
This paper consists of two halves. In the first half of the paper, we consider real-valued functions $f$ whose domain is the vertex set of a graph $G$ and that are Lipschitz with respect to the graph distance. By placing a uniform…
We establish lower bounds on the volume and the surface area of a geometric body using the size of its slices along different directions. In the first part of the paper, we derive volume bounds for convex bodies using generalized…
Information Bottlenecks (IBs) learn representations that generalize to unseen data by information compression. However, existing IBs are practically unable to guarantee generalization in real-world scenarios due to the vacuous…
This paper focuses on parameter estimation and introduces a new method for lower bounding the Bayesian risk. The method allows for the use of virtually \emph{any} information measure, including R\'enyi's $\alpha$, $\varphi$-Divergences, and…
We prove the Courtade-Kumar conjecture, which states that the mutual information between any Boolean function of an $n$-dimensional vector of independent and identically distributed inputs to a memoryless binary symmetric channel and the…
In the context of statistical learning, the Information Bottleneck method seeks a right balance between accuracy and generalization capability through a suitable tradeoff between compression complexity, measured by minimum description…
In decision-making problems under uncertainty, probabilistic constraints are a valuable tool to express safety of decisions. They result from taking the probability measure of a given set of random inequalities depending on the decision…
We introduce a bottleneck method for learning data representations based on information deficiency, rather than the more traditional information sufficiency. A variational upper bound allows us to implement this method efficiently. The…
Information bottleneck (IB) is a method for extracting information from one random variable $X$ that is relevant for predicting another random variable $Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has low…