Nonlinear Information Bottleneck

Information Theory 2022-11-22 v9 Machine Learning math.IT Machine Learning

Abstract

Information bottleneck (IB) is a technique for extracting information in one random variable X that is relevant for predicting another random variable Y. IB works by encoding X in a compressed "bottleneck" random variable M from which Y can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete X and Y with small state spaces, and continuous X and Y with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous X and Y, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
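
To make the trade-off concrete, the sketch below evaluates the two quantities that IB balances, the compression term I(X;M) and the prediction term I(Y;M), for the small discrete case mentioned in the abstract. It is an illustration only and does not implement the nonlinear method or the non-parametric bound proposed in the paper. The toy distribution (X uniform on four states, Y equal to the parity of X), the two hand-written encoders, and the helper names `mutual_information` and `ib_terms` are all assumptions introduced here. The standard IB objective, which the abstract does not write out, is usually stated as maximizing I(Y;M) - beta * I(X;M) over encoders p(m|x), where M depends on Y only through X.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_a[a] * p_b[b]))
    return mi


def ib_terms(p_xy, encoder):
    """Return (I(X;M), I(Y;M)) for a stochastic encoder encoder[x][m] = p(m|x).

    Assumes M is generated from X alone, so p(y, m) = sum_x p(x, y) p(m|x).
    """
    n_x, n_y, n_m = len(p_xy), len(p_xy[0]), len(encoder[0])
    p_x = [sum(row) for row in p_xy]
    p_xm = [[p_x[x] * encoder[x][m] for m in range(n_m)] for x in range(n_x)]
    p_ym = [
        [sum(p_xy[x][y] * encoder[x][m] for x in range(n_x)) for m in range(n_m)]
        for y in range(n_y)
    ]
    return mutual_information(p_xm), mutual_information(p_ym)


# Hypothetical toy problem: X uniform on {0, 1, 2, 3}, Y = X mod 2.
p_xy = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]

# Encoder A copies X into M (no compression).
enc_identity = [[1.0 if m == x else 0.0 for m in range(4)] for x in range(4)]
# Encoder B keeps only the parity of X (a 2-state bottleneck).
enc_parity = [[1.0 if m == x % 2 else 0.0 for m in range(2)] for x in range(4)]

i_xm_id, i_ym_id = ib_terms(p_xy, enc_identity)
i_xm_par, i_ym_par = ib_terms(p_xy, enc_parity)

beta = 0.5  # assumed trade-off weight, chosen only for illustration
for name, i_xm, i_ym in [("identity", i_xm_id, i_ym_id), ("parity", i_xm_par, i_ym_par)]:
    print(f"{name}: I(X;M)={i_xm:.3f} bits, I(Y;M)={i_ym:.3f} bits, "
          f"objective={i_ym - beta * i_xm:.3f}")
```

In this toy case both encoders retain the full 1 bit of information about Y, but the parity encoder does so while keeping only 1 bit about X instead of 2, so it scores higher on the objective for any positive beta. Enumerating joint tables like this is feasible only for small discrete state spaces, which is the limitation the abstract describes.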

Cite

@article{arxiv.1705.02436,
  title  = {Nonlinear Information Bottleneck},
  author = {Artemy Kolchinsky and Brendan D. Tracey and David H. Wolpert},
  journal= {arXiv preprint arXiv:1705.02436},
  year   = {2022}
}