Related papers: Prediction by Compression

An Efficient Compression of Deep Neural Network Checkpoints Based on Prediction and Context Modeling

This paper is dedicated to an efficient compression of weights and optimizer states (called checkpoints) obtained at different stages during a neural network training process. First, we propose a prediction-based compression approach, where…

Machine Learning · Computer Science 2025-06-16 Yuriy Kim, Evgeny Belyaev

Learning Representations by Maximizing Compression

We give an algorithm that learns a representation of data through compression. The algorithm 1) predicts bits sequentially from those previously seen and 2) has a structure and a number of computations similar to an autoencoder. The…

Computer Vision and Pattern Recognition · Computer Science 2011-08-05 Karol Gregor, Yann LeCun

Crossword: A Semantic Approach to Data Compression via Masking

The traditional methods for data compression are typically based on the symbol-level statistics, with the information source modeled as a long sequence of i.i.d. random variables or a stochastic process, thus establishing the fundamental…

Computation and Language · Computer Science 2023-04-04 Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui

Real-Time Text Transmission via LLM-Based Entropy Coding over Fixed-Rate Channels

Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…

Information Theory · Computer Science 2026-05-05 Vishnu Teja Kunde, Jean-Francois Chamberland, Krishna R. Narayanan, Jamison Ebert

PivotCompress: Compression by Sorting

Sorted data is usually easier to compress than unsorted permutations of the same data. This motivates a simple compression scheme: specify the sorted permutation of the data along with a representation of the sorted data compressed…

Data Structures and Algorithms · Computer Science 2014-11-24 Oscar Stiffelman

Time-universal data compression and prediction

Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data-compressors. It seems natural to try all the compressors and then choose the best, i.e. the one that gives the shortest…

Information Theory · Computer Science 2018-09-11 Boris Ryabko

Compression, Generalization and Learning

A compression function is a map that slims down an observational set into a subset of reduced size, while preserving its informational content. In multiple applications, the condition that one new observation makes the compressed set change…

Machine Learning · Computer Science 2024-01-09 Marco C. Campi, Simone Garatti

Compression Scheme for Faster and Secure Data Transmission Over Internet

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…

Performance · Computer Science 2007-05-23 B. S. Shajeemohan, Dr. V. K. Govindan

The Minimal Compression Rate for Similarity Identification

Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case…

Information Theory · Computer Science 2013-12-10 Amir Ingber, Tsachy Weissman

Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models

We explore the idea of compressing the prompts used to condition language models, and show that compressed prompts can retain a substantive amount of information about the original prompt. For severely compressed prompts, while fine-grained…

Computation and Language · Computer Science 2022-10-10 David Wingate, Mohammad Shoeybi, Taylor Sorensen

Optimal Compression for Minimizing Classification Error Probability: an Information-Theoretic Approach

We formulate the problem of performing optimal data compression under the constraints that compressed data can be used for accurate classification in machine learning. We show that this translates to a problem of minimizing the mutual…

Signal Processing · Electrical Eng. & Systems 2022-11-04 Jingchao Gao, Ao Tang, Weiyu Xu

An Introduction to Neural Data Compression

Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression…

Machine Learning · Computer Science 2023-08-22 Yibo Yang, Stephan Mandt, Lucas Theis

A Model-Driven Lossless Compression Algorithm Resistant to Mismatch

Due to the fundamental connection between next-symbol prediction and compression, modern predictive models, such as large language models (LLMs), can be combined with entropy coding to achieve compression rates that surpass those of…

Information Theory · Computer Science 2026-01-27 Cordelia Hu, Jennifer Tang

Automating Political Bias Prediction

Every day media generate large amounts of text. An unbiased view on media reports requires an understanding of the political bias of media content. Assistive technology for estimating the political bias of texts can be helpful in this…

Social and Information Networks · Computer Science 2016-08-09 Felix Biessmann

Synchronizing Probabilities in Model-Driven Lossless Compression

It is well-known in the field of lossless data compression that probabilistic next-symbol prediction can be used to compress sequences of symbols. Deep neural networks are able to capture rich dependencies in data, offering a powerful means…

Information Theory · Computer Science 2026-03-10 Aviv Adler, Jennifer Tang

Using compression to identify acronyms in text

Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key…

Digital Libraries · Computer Science 2007-05-23 Stuart Yeates, David Bainbridge, Ian H. Witten

Using Image Transformations to Learn Network Structure

Many learning tasks require observing a sequence of images and making a decision. In a transportation problem of designing and planning for shipping boxes between nodes, we show how to treat the network of nodes and the flows between them…

Machine Learning · Statistics 2023-06-12 Brayan Ortiz, Amitabh Sinha

Data Compression with Prime Numbers

A compression algorithm is presented that uses the set of prime numbers. Sequences of numbers are correlated with the prime numbers, and labeled with the integers. The algorithm can be iterated on data sets, generating factors of doubles on…

General Physics · Physics 2007-05-23 Gordon Chalmers

Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction

As large language models (LLMs) continue to be deployed and utilized across domains, the volume of LLM-generated data is growing rapidly. This trend highlights the increasing importance of effective and lossless compression for such data in…

Machine Learning · Computer Science 2025-05-13 Yu Mao, Holger Pirk, Chun Jason Xue

On the connection between compression learning and scenario based optimization

We investigate the connections between compression learning and scenario based optimization. We first show how to strengthen, or relax the consistency assumption at the basis of compression learning and study the learning and generalization…

Systems and Control · Computer Science 2014-03-07 Kostas Margellos, Maria Prandini, John Lygeros