Related papers: The Conditional Entropy Bottleneck

CEB Improves Model Robustness

We demonstrate that the Conditional Entropy Bottleneck (CEB) can improve model robustness. CEB is an easy strategy to implement and works in tandem with data augmentation procedures. We report results of a large scale adversarial robustness…

Machine Learning · Computer Science 2020-10-28 Ian Fischer, Alexander A. Alemi

Training Normalizing Flows with the Information Bottleneck for Competitive Generative Classification

The Information Bottleneck (IB) objective uses information theory to formulate a task-performance versus robustness trade-off. It has been successfully applied in the standard discriminative classification setting. We pose the question…

Machine Learning · Computer Science 2021-01-13 Lynton Ardizzone, Radek Mackowiak, Carsten Rother, Ullrich Köthe

Tighter Bounds on the Information Bottleneck with Application to Deep Learning

Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters. The quality of the learned representations impacts the DNN's generalization ability and the coherence of the…

Machine Learning · Computer Science 2024-02-13 Nir Weingarten, Zohar Yakhini, Moshe Butman, Ran Gilad-Bachrach

A Generalized Information Bottleneck Theory of Deep Learning

The Information Bottleneck (IB) principle offers a compelling theoretical framework to understand how neural networks (NNs) learn. However, its practical utility has been constrained by unresolved theoretical ambiguities and significant…

Machine Learning · Computer Science 2026-02-02 Charles Westphal, Stephen Hailes, Mirco Musolesi

Information Bottleneck Analysis of Deep Neural Networks via Lossy Compression

The Information Bottleneck (IB) principle offers an information-theoretic framework for analyzing the training process of deep neural networks (DNNs). Its essence lies in tracking the dynamics of two mutual information (MI) values: between…

Machine Learning · Computer Science 2024-05-10 Ivan Butakov, Alexander Tolmachev, Sofia Malanchuk, Anna Neopryatnaya, Alexey Frolov, Kirill Andreev

There Was Never a Bottleneck in Concept Bottleneck Models

Deep learning representations are often difficult to interpret, which can hinder their deployment in sensitive applications. Concept Bottleneck Models (CBMs) have emerged as a promising approach to mitigate this issue by learning…

Machine Learning · Computer Science 2026-01-30 Antonio Almudévar, José Miguel Hernández-Lobato, Alfonso Ortega

Unlearning Information Bottleneck: Machine Unlearning of Systematic Patterns and Biases

Effective adaptation to distribution shifts in training data is pivotal for sustaining robustness in neural networks, especially when removing specific biases or outdated information, a process known as machine unlearning. Traditional…

Machine Learning · Computer Science 2024-05-24 Ling Han, Hao Huang, Dustin Scheinost, Mary-Anne Hartley, María Rodríguez Martínez

Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering

Benefiting from large-scale pretrained vision language models (VLMs), the performance of visual question answering (VQA) has approached human oracles. However, finetuning such models on limited data often suffers from overfitting and poor…

Computer Vision and Pattern Recognition · Computer Science 2023-05-09 Jingjing Jiang, Ziyi Liu, Nanning Zheng

Layer-wise Learning of Stochastic Neural Networks with Information Bottleneck

Information Bottleneck (IB) is a generalization of rate-distortion theory that naturally incorporates compression and relevance trade-offs for learning. Though the original IB has been extensively studied, there has not been much…

Machine Learning · Computer Science 2019-10-08 Thanh T. Nguyen, Jaesik Choi

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and a key is the existence of well trained classifiers on benchmark datasets. As a main-stream loss function, the cross entropy can…

Machine Learning · Computer Science 2022-09-22 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Elastic Information Bottleneck

Information bottleneck is an information-theoretic principle of representation learning that aims to learn a maximally compressed representation that preserves as much information about labels as possible. Under this principle, two…

Information Theory · Computer Science 2023-11-08 Yuyan Ni, Yanyan Lan, Ao Liu, Zhiming Ma

General Information Bottleneck Objectives and their Applications to Machine Learning

We view the Information Bottleneck Principle (IBP: Tishby et al., 1999; Schwartz-Ziv and Tishby, 2017) and Predictive Information Bottleneck Principle (PIBP: Still et al., 2007; Alemi, 2019) as special cases of a family of general…

Machine Learning · Computer Science 2019-12-24 Sayandev Mukherjee

Learning Optimal Multimodal Information Bottleneck Representations

Leveraging high-quality joint representations from multimodal data can greatly enhance model performance in various machine-learning based applications. Recent multimodal learning methods, based on the multimodal information bottleneck…

Machine Learning · Computer Science 2025-05-27 Qilong Wu, Yiyang Shao, Jun Wang, Xiaobo Sun

Learning Representations for Neural Network-Based Classification Using the Information Bottleneck Principle

In this theory paper, we investigate training deep neural networks (DNNs) for classification via minimizing the information bottleneck (IB) functional. We show that the resulting optimization problem suffers from two severe issues: First,…

Machine Learning · Computer Science 2020-08-10 Rana Ali Amjad, Bernhard C. Geiger

A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning

In the past decade, deep neural networks have seen unparalleled improvements that continue to impact every aspect of today's society. With the development of high performance GPUs and the availability of vast amounts of data, learning…

Machine Learning · Computer Science 2021-05-12 Mohammad Ali Alomrani

Counterfactual Supervision-based Information Bottleneck for Out-of-Distribution Generalization

Learning invariant (causal) features for out-of-distribution (OOD) generalization has attracted extensive attention recently, and among the proposals invariant risk minimization (IRM) is a notable solution. In spite of its theoretical…

Machine Learning · Computer Science 2023-02-01 Bin Deng, Kui Jia

Gated Information Bottleneck for Generalization in Sequential Environments

Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set. By learning minimum sufficient representations from training data, the information…

Machine Learning · Computer Science 2021-10-13 Francesco Alesiani, Shujian Yu, Xi Yu

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and one of the keys for achieving this is the existence of well trained classifiers on benchmark datasets which can be used as backbone…

Machine Learning · Computer Science 2022-10-04 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations

Learning effective joint embedding for cross-modal data has always been a focus in the field of multimodal machine learning. We argue that during multimodal fusion, the generated multimodal embedding may be redundant, and the discriminative…

Machine Learning · Computer Science 2022-12-06 Sijie Mai, Ying Zeng, Haifeng Hu

Transfer Entropy Bottleneck: Learning Sequence to Sequence Information Transfer

When presented with a data stream of two statistically dependent variables, predicting the future of one of the variables (the target stream) can benefit from information about both its history and the history of the other variable (the…

Machine Learning · Computer Science 2023-03-10 Damjan Kalajdzievski, Ximeng Mao, Pascal Fortier-Poisson, Guillaume Lajoie, Blake Richards