Related papers: Optimal Prefix Free Code in Linear Time

Optimal Prefix Free Codes With Partial Sorting

We describe an algorithm computing an optimal prefix free code for $n$ unsorted positive weights in time within $O(n(1+\lg \alpha))\subseteq O(n\lg n)$, where the alternation $\alpha\in[1..n-1]$ measures the amount of sorting required by…

Data Structures and Algorithms · Computer Science 2016-02-02 Jérémy Barbay

Optimal Prefix Codes with Fewer Distinct Codeword Lengths are Faster to Construct

A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the…

Data Structures and Algorithms · Computer Science 2016-09-30 Ahmed Belal, Amr Elmasry

Reserved-Length Prefix Coding

Huffman coding finds an optimal prefix code for a given probability mass function. Consider situations in which one wishes to find an optimal code with the restriction that all codewords have lengths that lie in a user-specified set of…

Information Theory · Computer Science 2008-01-03 Michael B. Baer

Space-Efficient Huffman Codes Revisited

Canonical Huffman code is an optimal prefix-free compression code whose codewords enumerated in the lexicographical order form a list of binary words in non-decreasing lengths. Gagie et al. (2015) gave a representation of this coding…

Data Structures and Algorithms · Computer Science 2021-08-19 Szymon Grabowski, Dominik Köppl

Twenty (or so) Questions: $D$-ary Length-Bounded Prefix Coding

Efficient optimal prefix coding has long been accomplished via the Huffman algorithm. However, there is still room for improvement and exploration regarding variants of the Huffman problem. Length-limited Huffman coding, useful for many…

Information Theory · Computer Science 2007-07-13 Michael B. Baer

Properties of optimal prefix-free machines as instantaneous codes

The optimal prefix-free machine U is a universal decoding algorithm used to define the notion of program-size complexity H(s) for a finite binary string s. Since the set of all halting inputs for U is chosen to form a prefix-free set, the…

Information Theory · Computer Science 2016-11-15 Kohtaro Tadaki

$D$-ary Bounded-Length Huffman Coding

Efficient optimal prefix coding has long been accomplished via the Huffman algorithm. However, there is still room for improvement and exploration regarding variants of the Huffman problem. Length-limited Huffman coding, useful for many…

Information Theory · Computer Science 2007-07-13 Michael B. Baer

Efficient and Compact Representations of Some Non-Canonical Prefix-Free Codes

For many kinds of prefix-free codes there are efficient and compact alternatives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in…

Data Structures and Algorithms · Computer Science 2021-04-02 Antonio Fariña, Travis Gagie, Szymon Grabowski, Giovanni Manzini, Gonzalo Navarro, Alberto Ordóñez

A Generic Top-Down Dynamic-Programming Approach to Prefix-Free Coding

Given a probability distribution over a set of n words to be transmitted, the Huffman Coding problem is to find a minimal-cost prefix free code for transmitting those words. The basic Huffman coding problem can be solved in O(n log n) time…

Data Structures and Algorithms · Computer Science 2008-09-29 Mordecai Golin, Xiaoming Xu, Jiajin Yu

Worst-case optimal adaptive alphabetic prefix-free coding

We give the first algorithm for adaptive alphabetic prefix-free coding that is worst-case optimal in terms of time and compression when $\sigma \in o \left( \frac{n^{1 / 2}}{\log n} \right)$, where $\sigma$ is the size of the alphabet and…

Data Structures and Algorithms · Computer Science 2026-01-08 Travis Gagie

Efficient and Compact Representations of Prefix Codes

Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix…

Data Structures and Algorithms · Computer Science 2015-06-30 Travis Gagie, Gonzalo Navarro, Yakov Nekrich, Alberto Ordóñez

More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding

There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size. While there is no known polynomial time algorithm for solving it optimally there are…

Information Theory · Computer Science 2007-07-13 Mordecai Golin, Li Jian

Fast Compressed Self-Indexes with Deterministic Linear-Time Construction

We introduce a compressed suffix array representation that, on a text $T$ of length $n$ over an alphabet of size $\sigma$, can be built in $O(n)$ deterministic time, within $O(n\log\sigma)$ bits of working space, and counts the number of…

Data Structures and Algorithms · Computer Science 2017-09-05 J. Ian Munro, Gonzalo Navarro, Yakov Nekrich

Optimal Time and Space Construction of Suffix Arrays and LCP Arrays for Integer Alphabets

Suffix arrays and LCP arrays are among the most fundamental data structures, widely used for various kinds of string processing. We consider two problems for a read-only string of length $N$ over an integer alphabet $[1, \dots, \sigma]$ for…

Data Structures and Algorithms · Computer Science 2019-07-16 Keisuke Goto

On the Construction of Prefix-Free and Fix-Free Codes with Specified Codeword Compositions

We investigate the construction of prefix-free and fix-free codes with specified codeword compositions. We present a polynomial time algorithm which constructs a fix-free code with the same codeword compositions as a given code for a…

Information Theory · Computer Science 2012-02-10 Ali Kakhbod, Morteza Zadimoghaddam

Optimal Computation of Overabundant Words

The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word $w$ in a given sequence $x$ can be used for classifying $w$ as avoided or overabundant. The definitions used for the expectation…

Data Structures and Algorithms · Computer Science 2017-05-10 Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos

Optimal Prefix Codes for Infinite Alphabets with Nonlinear Costs

Let $P = \{p(i)\}$ be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial $P$ for which known methods find a…

Information Theory · Computer Science 2016-11-17 Michael B. Baer

Efficient algorithms for modifying and sampling from a categorical distribution

Probabilistic programming languages and other machine learning applications often require samples to be generated from a categorical distribution where the probability of each one of $n$ categories is specified as a parameter. If the…

Data Structures and Algorithms · Computer Science 2019-06-28 Daniel Tang

Tales of Huffman

We study the new problem of Huffman-like codes subject to individual restrictions on the code-word lengths of a subset of the source words. These are prefix codes with minimal expected code-word length for a random source where additionally…

Information Theory · Computer Science 2007-07-13 Paul M. B. Vitanyi, Zvi Lotker

Top Tree Compression of Tries

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille, Inge Li Gørtz, Paweł Gawrychowski, Gad M. Landau, Oren Weimann