Related papers: Entropy coding with Variable Length Re-writing Sys…

EAH: A New Encoder based on Adaptive Variable-length Codes

Adaptive variable-length codes associate a variable-length codeword to the symbol being encoded depending on the previous symbols in the input string. This class of codes has been recently presented in [Dragos Trinca, arXiv:cs.DS/0505007]…

Data Structures and Algorithms · Computer Science 2007-05-23 Dragos Trinca

Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions

In this paper, we propose a source coding scheme that represents data from unknown distributions through frequency and support information. Existing encoding schemes often compress data by sacrificing computational efficiency or by assuming…

Information Theory · Computer Science 2024-10-28 Leah Woldemariam, Hang Liu, Anna Scaglione

Variable-Length Lossy Compression Allowing Positive Overflow and Excess Distortion Probabilities

This paper investigates the problem of variable-length lossy source coding allowing a positive excess distortion probability and an overflow probability of codeword lengths. Novel one-shot achievability and converse bounds of the optimal…

Information Theory · Computer Science 2018-12-17 Shota Saito, Hideki Yagi, Toshiyasu Matsushima

Universal Variable-to-Fixed Length Lossy Compression at Finite Blocklengths

We consider universal variable-to-fixed length compression of memoryless sources with a fidelity criterion. We design a dictionary codebook over the reproduction alphabet which is used to parse the source stream. Once a source subsequence…

Information Theory · Computer Science 2022-11-24 Nematollah Iri

Enumerative Data Compression with Non-Uniquely Decodable Codes

Non-uniquely decodable codes can be defined as the codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, where a codeword can be a prefix of other(s), and…

Data Structures and Algorithms · Computer Science 2019-11-14 M. Oğuzhan Külekci, Yasin Öztürk, Elif Altunok, Can Altıniğne

Real-Time Text Transmission via LLM-Based Entropy Coding over Fixed-Rate Channels

Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…

Information Theory · Computer Science 2026-05-05 Vishnu Teja Kunde, Jean-Francois Chamberland, Krishna R. Narayanan, Jamison Ebert

Encoding of probability distributions for Asymmetric Numeral Systems

Many data compressors regularly encode probability distributions for entropy coding - requiring minimal description length type of optimizations. Canonical prefix/Huffman coding usually just writes lengths of bit sequences, this way…

Information Theory · Computer Science 2022-07-05 Jarek Duda

Combinatorial Entropy Encoding

This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of…

Information Theory · Computer Science 2017-03-24 Abu Bakar Siddique

Entropy Coding of Unordered Data Structures

We present shuffle coding, a general method for optimal compression of sequences of unordered objects using bits-back coding. Data structures that can be compressed using shuffle coding include multisets, graphs, hypergraphs, and others. We…

Machine Learning · Computer Science 2024-08-19 Julius Kunze, Daniel Severo, Giulio Zani, Jan-Willem van de Meent, James Townsend

Binary code optimization

This article shows that any type of binary data can be defined as a collection of codewords of variable length. This feature helps us to define an injective and surjective function from the suggested codewords to the required codewords.…

Information Theory · Computer Science 2021-10-05 Parviz Gharehbagheri, Sayeed Hamid Haji Sayeed Javadi, Parvaneh Asghari, Naser Gharehbagheri

This paper presents new lower and upper bounds for the compression rate of binary prefix codes optimized over memoryless sources according to various nonlinear codeword length objectives. Like the most well-known redundancy bounds for…

Information Theory · Computer Science 2010-10-08 Michael B. Baer

Improving Run Length Encoding by Preprocessing

The Run Length Encoding (RLE) compression method is a long standing simple lossless compression scheme which is easy to implement and achieves a good compression on input data which contains repeating consecutive symbols. In its pure form…

Data Structures and Algorithms · Computer Science 2021-04-01 Sven Fiergolla, Petra Wolf

Run-Length Encoding in a Finite Universe

Text compression schemes and compact data structures usually combine sophisticated probability models with basic coding methods whose average codeword length closely matches the entropy of known distributions. In the frequent case where basic…

Information Theory · Computer Science 2019-10-02 N. Jesper Larsson

Fine Asymptotics for Universal One-to-One Compression of Parametric Sources

Universal source coding at short blocklengths is considered for an exponential family of distributions. The Type Size code has previously been shown to be optimal up to the third-order rate for universal compression of all memoryless…

Information Theory · Computer Science 2016-12-21 Nematollah Iri, Oliver Kosut

Lists that are smaller than their parts: A coding approach to tunable secrecy

We present a new information-theoretic definition and associated results, based on list decoding in a source coding setting. We begin by presenting list-source codes, which naturally map a key length (entropy) to list size. We then show…

Information Theory · Computer Science 2012-10-09 Flavio du Pin Calmon, Muriel Médard, Linda M. Zeger, João Barros, Mark M. Christiansen, Ken R. Duffy

Prefix Codes: Equiprobable Words, Unequal Letter Costs

Describes a near-linear-time algorithm for a variant of Huffman coding, in which the letters may have non-uniform lengths (as in Morse code), but with the restriction that each word to be encoded has equal probability. [See also ``Huffman…

Data Structures and Algorithms · Computer Science 2015-06-02 Mordecai Golin, Neal E. Young

Fundamental Limits of Universal Variable-to-Fixed Length Coding of Parametric Sources

Universal variable-to-fixed (V-F) length coding of $d$-dimensional exponential family of distributions is considered. We propose an achievable scheme consisting of a dictionary, used to parse the source output stream, making use of the…

Information Theory · Computer Science 2017-08-02 Nematollah Iri, Oliver Kosut

Minimum Description Length codes are critical

In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are…

Methodology · Statistics 2018-10-03 Ryan John Cubero, Matteo Marsili, Yasser Roudi

Concentric Permutation Source Codes

Permutation codes are a class of structured vector quantizers with a computationally-simple encoding procedure based on sorting the scalar components. Using a codebook comprising several permutation codes as subcodes preserves the…

Information Theory · Computer Science 2015-03-13 Ha Q. Nguyen, Lav R. Varshney, Vivek K Goyal

Investigations on Algorithm Selection for Interval-Based Coding Methods

There is a class of entropy-coding methods which do not substitute symbols by code words (such as Huffman coding), but operate on intervals or ranges. This class includes three prominent members: conventional arithmetic coding, range…

Information Theory · Computer Science 2025-07-04 Tilo Strutz, Nico Schreiber