Related papers: Combinatorial Entropy Encoding

Semantic Huffman Coding using Synonymous Mapping

Semantic communication stands out as a highly promising avenue for future developments in communications. Theoretically, source compression coding based on semantics can achieve lower rates than Shannon entropy. This paper introduces a…

Information Theory · Computer Science 2024-01-29 Jin Xu, Kai Niu, Zijian Liang, Ping Zhang

Crossword: A Semantic Approach to Data Compression via Masking

The traditional methods for data compression are typically based on the symbol-level statistics, with the information source modeled as a long sequence of i.i.d. random variables or a stochastic process, thus establishing the fundamental…

Computation and Language · Computer Science 2023-04-04 Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui

PivotCompress: Compression by Sorting

Sorted data is usually easier to compress than unsorted permutations of the same data. This motivates a simple compression scheme: specify the sorted permutation of the data along with a representation of the sorted data compressed…

Data Structures and Algorithms · Computer Science 2014-11-24 Oscar Stiffelman

Permutation Entropy for Signal Analysis

Shannon Entropy is the preeminent tool for measuring the level of uncertainty (and conversely, information content) in a random variable. In the field of communications, entropy can be used to express the information content of given…

Information Theory · Computer Science 2024-11-06 Bill Kay, Audun Myers, Thad Boydston, Emily Ellwein, Cameron Mackenzie, Iliana Alvarez, Erik Lentz

Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding

Modern data compression is mainly based on two approaches to entropy coding: Huffman (HC) and arithmetic/range coding (AC). The former is much faster, but approximates probabilities with powers of 2, usually leading to relatively low…

Information Theory · Computer Science 2014-01-07 Jarek Duda

Data Compression with Relative Entropy Coding

Over the last few years, machine learning unlocked previously infeasible features for compression, such as providing guarantees for users' privacy or tailoring compression to specific data statistics (e.g., satellite images or audio…

Information Theory · Computer Science 2026-03-25 Gergely Flamich

Real-Time Text Transmission via LLM-Based Entropy Coding over Fixed-Rate Channels

Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…

Information Theory · Computer Science 2026-05-05 Vishnu Teja Kunde, Jean-Francois Chamberland, Krishna R. Narayanan, Jamison Ebert

A measure of compression gain for new symbols in data-compression

Huffman encoding is often improved by using block codes, for example a 3-block would be an alphabet consisting of each possible combination of three characters. We take the approach of starting with a base alphabet and expanding it to…

Information Theory · Computer Science 2016-11-08 Richard M Fredlund

On nonlinear compression costs: when Shannon meets Rényi

Shannon entropy is the shortest average codeword length a lossless compressor can achieve by encoding i.i.d. symbols. However, there are cases in which the objective is to minimize the exponential average codeword length, i.e. when…

Information Theory · Computer Science 2024-06-10 Andrea Somazzi, Paolo Ferragina, Diego Garlaschelli

BIN@ERN: Binary-Ternary Compressing Data Coding

This paper describes a new method of data encoding which may be used in various modern digital, computer and telecommunication systems and devices. The method permits the compression of data for storage or transmission, allowing the exact…

Information Theory · Computer Science 2012-01-27 Igor Nesiolovskiy, Artem Nesiolovskiy

Compression and information entropy of binary strings from the collision history of three hard balls

We investigate how to measure and define the entropy of a simple chaotic system, three hard spheres on a ring. A novel approach is presented, which does not assume the ergodic hypothesis. It consists of transforming the particles collision…

Computational Physics · Physics 2023-05-08 Matej Vedak, Graeme J Ackland

Encoding of probability distributions for Asymmetric Numeral Systems

Many data compressors regularly encode probability distributions for entropy coding - requiring minimal description length type of optimizations. Canonical prefix/Huffman coding usually just writes lengths of bit sequences, this way…

Information Theory · Computer Science 2022-07-05 Jarek Duda

Cryptographic Compression

We introduce a protocol called ENCORE which simultaneously compresses and encrypts data in a one-pass process that can be implemented efficiently and possesses a number of desirable features as a streaming encoder/decoder. Motivated by the…

Cryptography and Security · Computer Science 2025-01-28 Joshua Cooper, Grant Fickes

On principles of large deviation and selected data compression

The Shannon Noiseless coding theorem (the data-compression principle) asserts that for an information source with an alphabet $\mathcal X=\{0,\ldots ,\ell -1\}$ and an asymptotic equipartition property, one can reduce the number of stored…

Information Theory · Computer Science 2016-04-26 Yuri Suhov, Izabella Stuhl

Investigations on Algorithm Selection for Interval-Based Coding Methods

There is a class of entropy-coding methods which do not substitute symbols by code words (such as Huffman coding), but operate on intervals or ranges. This class includes three prominent members: conventional arithmetic coding, range…

Information Theory · Computer Science 2025-07-04 Tilo Strutz, Nico Schreiber

Lightweight compression with encryption based on Asymmetric Numeral Systems

Data compression combined with effective encryption is a common requirement of data storage and transmission. Low cost of these operations is often a high priority in order to increase transmission speed and reduce power usage. This…

Information Theory · Computer Science 2023-03-24 Jarek Duda, Marcin Niemiec

Entropy estimation of symbol sequences

We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our…

Statistical Mechanics · Physics 2017-04-24 Thomas Schürmann, Peter Grassberger

Single-Stage Huffman Encoder for ML Compression

Training and serving Large Language Models (LLMs) require partitioning data across multiple accelerators, where collective operations are frequently bottlenecked by network bandwidth. Lossless compression using Huffman codes is an effective…

Machine Learning · Computer Science 2026-01-16 Aditya Agrawal, Albert Magyar, Hiteshwar Eswaraiah, Patrick Sheridan, Pradeep Janedula, Ravi Krishnan Venkatesan, Krishna Nair, Ravi Iyer

A new hybrid jpeg image compression scheme using symbol reduction technique

Lossy JPEG compression is a widely used compression technique. Normally, the JPEG standard technique uses three processes: mapping, which reduces interpixel redundancy; quantization, which is a lossy process; and entropy encoding, which is considered…

Multimedia · Computer Science 2012-08-15 Bheshaj Kumar, Kavita Thakur, G. R. Sinha

Entropy Coding of Unordered Data Structures

We present shuffle coding, a general method for optimal compression of sequences of unordered objects using bits-back coding. Data structures that can be compressed using shuffle coding include multisets, graphs, hypergraphs, and others. We…

Machine Learning · Computer Science 2024-08-19 Julius Kunze, Daniel Severo, Giulio Zani, Jan-Willem van de Meent, James Townsend