Related papers: Understanding Encoder-Decoder Structures in Machin…

High Mutual Information in Representation Learning with Symmetric Variational Inference

We introduce the Mutual Information Machine (MIM), a novel formulation of representation learning, using a joint distribution over the observations and latent state in an encoder/decoder framework. Our key principles are symmetry and mutual…

Machine Learning · Statistics 2019-10-10 Micha Livne, Kevin Swersky, David J. Fleet

Dynamic Encoding and Decoding of Information for Split Learning in Mobile-Edge Computing: Leveraging Information Bottleneck Theory

Split learning is a privacy-preserving distributed learning paradigm in which an ML model (e.g., a neural network) is split into two parts (i.e., an encoder and a decoder). The encoder shares a so-called latent representation, rather than raw…

Machine Learning · Computer Science 2023-09-07 Omar Alhussein, Moshi Wei, Arashmid Akhavain

MIM: Mutual Information Machine

We introduce the Mutual Information Machine (MIM), a probabilistic auto-encoder for learning joint distributions over observations and latent variables. MIM reflects three design principles: 1) low divergence, to encourage the encoder and…

Machine Learning · Computer Science 2020-02-24 Micha Livne, Kevin Swersky, David J. Fleet

Minimum Description Length and Generalization Guarantees for Representation Learning

A major challenge in designing efficient statistical supervised learning algorithms is finding representations that perform well not only on available training samples but also on unseen data. While the study of representation learning has…

Machine Learning · Statistics 2024-02-06 Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

The Role of Information Complexity and Randomization in Representation Learning

A grand challenge in representation learning is to learn the different explanatory factors of variation behind the high-dimensional data. Encoder models are often determined to optimize performance on training data when the real objective…

Machine Learning · Statistics 2018-02-16 Matías Vera, Pablo Piantanida, Leonardo Rey Vega

An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation

The traditional SegNet architecture commonly encounters significant information loss during the sampling process, which detrimentally affects its accuracy in image semantic segmentation tasks. To counter this challenge, we introduce an…

Image and Video Processing · Electrical Eng. & Systems 2024-06-05 Zijun Gao, Qi Wang, Taiyuan Mei, Xiaohan Cheng, Yun Zi, Haowei Yang

Learning Speaker Representations with Mutual Information

Learning good representations is of crucial importance in deep learning. Mutual Information (MI) or similar measures of statistical dependence are promising tools for learning these representations in an unsupervised way. Even though the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-09 Mirco Ravanelli, Yoshua Bengio

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and one of the keys for achieving this is the existence of well trained classifiers on benchmark datasets which can be used as backbone…

Machine Learning · Computer Science 2022-10-04 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Studying the Interplay between Information Loss and Operation Loss in Representations for Classification

Information-theoretic measures have been widely adopted in the design of features for learning and decision problems. Inspired by this, we look at the relationship between i) a weak form of information loss in the Shannon sense and ii) the…

Machine Learning · Computer Science 2022-01-03 Jorge F. Silva, Felipe Tobar, Mario Vicuña, Felipe Cordova

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and a key is the existence of well trained classifiers on benchmark datasets. As a main-stream loss function, the cross entropy can…

Machine Learning · Computer Science 2022-09-22 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Discriminative Mutual Information Estimation for the Design of Channel Capacity Driven Autoencoders

The development of optimal and efficient machine learning-based communication systems is likely to be a key enabler of beyond 5G communication technologies. In this direction, physical layer design has been recently reformulated under a…

Information Theory · Computer Science 2021-11-16 Nunzio A. Letizia, Andrea M. Tonello

Language Model Memory and Memory Models for Language

The ability of machine learning models to store input information in hidden layer vector embeddings, analogous to the concept of 'memory', is widely employed but not well characterized. We find that language model embeddings typically…

Computation and Language · Computer Science 2026-05-20 Benjamin L. Badger

Modeling Lost Information in Lossy Image Compression

Lossy image compression is one of the most commonly used operators for digital images. Most recently proposed deep-learning-based image compression methods leverage the auto-encoder structure, and reach a series of promising results in this…

Computer Vision and Pattern Recognition · Computer Science 2020-07-09 Yaolong Wang, Mingqing Xiao, Chang Liu, Shuxin Zheng, Tie-Yan Liu

Information Structure in Mappings: An Approach to Learning, Representation, and Generalisation

Despite the remarkable success of large-scale neural networks, we still lack unified notation for thinking about and describing their representational spaces. We lack methods to reliably describe how their representations are…

Machine Learning · Computer Science 2025-06-02 Henry Conklin

Information-Theoretic Framework for Understanding Modern Machine-Learning

We introduce an information-theoretic framework that views learning as universal prediction under log loss, characterized through regret bounds. Central to the framework is an effective notion of architecture-based model complexity, defined…

Machine Learning · Computer Science 2025-11-04 Meir Feder, Ruediger Urbanke, Yaniv Fogel

Rethinking the Understanding Ability across LLMs through Mutual Information

Recent advances in large language models (LLMs) have revolutionized natural language processing, yet evaluating their intrinsic linguistic understanding remains challenging. Moving beyond specialized evaluation tasks, we propose an…

Computation and Language · Computer Science 2025-06-02 Shaojie Wang, Sirui Ding, Na Zou

Informational Embodiment: Computational role of information structure in codes and robots

The body morphology plays an important role in the way information is perceived and processed by an agent. We address an information theory (IT) account on how the precision of sensors, the accuracy of motors, their placement, the body…

Robotics · Computer Science 2024-08-26 Alexandre Pitti, Kohei Nakajima, Yasuo Kuniyoshi

Performance Indicator in Multilinear Compressive Learning

Recently, the Multilinear Compressive Learning (MCL) framework was proposed to efficiently optimize the sensing and learning steps when working with multidimensional signals, i.e. tensors. In Compressive Learning in general, and in MCL in…

Computer Vision and Pattern Recognition · Computer Science 2020-09-23 Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval

Accurately matching visual and textual data in cross-modal retrieval has been widely studied in the multimedia community. To address the challenges posed by the heterogeneity gap and the semantic gap, we propose integrating Shannon…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

Learning is Forgetting: LLM Training As Lossy Compression

Despite the increasing prevalence of large language models (LLMs), we still have a limited understanding of how their representational spaces are structured. This limits our ability to interpret how and what they learn or relate them to…

Machine Learning · Computer Science 2026-04-10 Henry C. Conklin, Tom Hosking, Tan Yi-Chern, Julian Gold, Jonathan D. Cohen, Thomas L. Griffiths, Max Bartolo, Seraphina Goldfarb-Tarrant