Related papers: Fast Codes for Large Alphabets

Fast Recursive Coding Based on Grouping of Symbols

A novel fast recursive coding technique is proposed. It operates with only integer values no longer than 8 bits and is multiplication-free. The recursion on which the algorithm is based indirectly provides rather effective coding of symbols for very…

Information Theory · Computer Science 2007-08-22 Nikolay Ponomarenko, Vladimir Lukin, Karen Egiazarian, Jaakko Astola, Boris Y Ryabko

Benefiting from Disorder: Source Coding for Unordered Data

The order of letters is not always relevant in a communication task. This paper discusses the implications of order irrelevance on source coding, presenting results in several major branches of source coding theory: lossless coding,…

Information Theory · Computer Science 2007-08-20 Lav R. Varshney, Vivek K. Goyal

Large Alphabet Source Coding using Independent Component Analysis

Large alphabet source coding is a basic and well-studied problem in data compression. It has many applications such as compression of natural language text, speech and images. The classic perception of most commonly used methods is that a…

Information Theory · Computer Science 2016-07-26 Amichai Painsky, Saharon Rosset, Meir Feder

Real-Time Variable-to-Fixed Lossless Source Coding of Randomly Arriving Symbols

We address the recently suggested problem of causal lossless coding of randomly arriving source samples. We construct variable-to-fixed coding schemes and show that they outperform the previously considered fixed-to-variable schemes when…

Information Theory · Computer Science 2020-10-27 Uri Abend, Anatoly Khina

Universal Weak Variable-Length Source Coding on Countable Infinite Alphabets

Motivated by the fact that universal source coding on countably infinite alphabets is not feasible, this work introduces the notion of almost lossless source coding. Analogous to the weak variable-length source coding problem studied by Han…

Information Theory · Computer Science 2021-11-30 Jorge F. Silva, Pablo Piantanida

More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding

There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size. While there is no known polynomial time algorithm for solving it optimally, there are…

Information Theory · Computer Science 2007-07-13 Mordecai Golin, Li Jian

Compressing Multisets with Large Alphabets using Bits-Back Coding

Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings. We show how to convert a compression…

Information Theory · Computer Science 2023-02-28 Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani, Karen Ullrich

Universal Lossless Compression with Unknown Alphabets - The Average Case

Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive…

Information Theory · Computer Science 2016-11-17 Gil I. Shamir

Finite Alphabet Fast List Decoders for Polar Codes

The so-called fast polar decoding schedules are meant to improve the decoding speed of the sequential-natured successive cancellation list decoders. The decoding speedup is achieved by replacing various parts of the serial decoding process…

Information Theory · Computer Science 2024-06-21 Syed Aizaz Ali Shah, Gerhard Bauch

Optimal Merging Algorithms for Lossless Codes with Generalized Criteria

This paper presents lossless prefix codes optimized with respect to a pay-off criterion consisting of a convex combination of maximum codeword length and average codeword length. The optimal codeword lengths obtained are based on a new…

Information Theory · Computer Science 2012-08-18 Themistoklis Charalambous, Charalambos D. Charalambous, Farzad Rezaei

Tales of Huffman

We study the new problem of Huffman-like codes subject to individual restrictions on the code-word lengths of a subset of the source words. These are prefix codes with minimal expected code-word length for a random source where additionally…

Information Theory · Computer Science 2007-07-13 Paul M. B. Vitanyi, Zvi Lotker

Sparse Sequential Dirichlet Coding

This short paper describes a simple coding technique, Sparse Sequential Dirichlet Coding, for multi-alphabet memoryless sources. It is appropriate in situations where only a small, unknown subset of the possible alphabet symbols can be…

Information Theory · Computer Science 2012-06-19 Joel Veness, Marcus Hutter

Lossless Coding with Generalised Criteria

This paper presents prefix codes which minimize various criteria constructed as a convex combination of maximum codeword length and average codeword length or maximum redundancy and average redundancy, including a convex combination of the…

Information Theory · Computer Science 2011-02-11 Charalambos D. Charalambous, Themistoklis Charalambous, Farzad Rezaei

Asymptotics and Non-asymptotics for Universal Fixed-to-Variable Source Coding

Universal fixed-to-variable lossless source coding for memoryless sources is studied in the finite blocklength and higher-order asymptotics regimes. Optimal third-order coding rates are derived for general fixed-to-variable codes and for…

Information Theory · Computer Science 2014-12-16 Oliver Kosut, Lalitha Sankar

Applications of Universal Source Coding to Statistical Analysis of Time Series

We show how universal codes can be used for solving some of the most important statistical problems for time series. By definition, a universal code (or a universal lossless data compressor) can compress any sequence generated by a…

Information Theory · Computer Science 2008-09-09 Boris Ryabko

Subspace-Aware Index Codes

In this paper, we generalize the well-known index coding problem to exploit the structure in the source-data to improve system throughput. In many applications, the data to be transmitted may lie (or can be well approximated) in a…

Information Theory · Computer Science 2017-04-11 Bhavya Kailkhura, Lakshmi Narasimhan Theagarajan, Pramod K. Varshney

Optimal Prefix Codes for Infinite Alphabets with Nonlinear Costs

Let $P = \{p(i)\}$ be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial $P$ for which known methods find a…

Information Theory · Computer Science 2016-11-17 Michael B. Baer

Code Similarity on High Level Programs

This paper presents a new approach to code similarity for high-level programs. Our technique is based on Fast Dynamic Time Warping, which builds a warp path, or point relation, with local restrictions. The source code is represented as Time…

Computer Vision and Pattern Recognition · Computer Science 2007-10-31 M. Miron Bernal, H. Coyote Estrada, J. Figueroa Nazuno

About Optimal Prefix Codes over Countably Infinite Alphabets: Probabilistic Intervals for the Codeword Lengths Assignment

For the discrete memoryless sources with a countably infinite alphabet, we prove that for any positive integer $k$, there exists a corresponding probability interval such that if the largest symbol probability $p_{1}$ falls in this…

Information Theory · Computer Science 2026-04-21 Hongyang Liu, Wei Yan

Pattern Coding Meets Censoring: (almost) Adaptive Coding on Countable Alphabets

Adaptive coding faces the following problem: given a collection of source classes such that each class in the collection has non-trivial minimax redundancy rate, can we design a single code which is asymptotically minimax over each class in…

Information Theory · Computer Science 2016-09-02 Anna Ben-Hamou, Stephane Boucheron, Elisabeth Gassiat