Related papers: Worst-Case Optimal Adaptive Prefix Coding

Worst-case optimal adaptive alphabetic prefix-free coding

We give the first algorithm for adaptive alphabetic prefix-free coding that is worst-case optimal in terms of time and compression when $\sigma \in o \left( \frac{n^{1 / 2}}{\log n} \right)$, where $\sigma$ is the size of the alphabet and…

Data Structures and Algorithms · Computer Science 2026-01-08 Travis Gagie

Efficient and Compact Representations of Prefix Codes

Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix…

Data Structures and Algorithms · Computer Science 2015-06-30 Travis Gagie, Gonzalo Navarro, Yakov Nekrich, Alberto Ordóñez

Low-Memory Adaptive Prefix Coding

In this paper we study the adaptive prefix coding problem in cases where the size of the input alphabet is large. We present an online prefix coding algorithm that uses $O(\sigma^{1 / \lambda + \epsilon})$ bits of space for any constants…

Data Structures and Algorithms · Computer Science 2008-11-24 Travis Gagie, Marek Karpinski, Yakov Nekrich

Efficient and Compact Representations of Some Non-Canonical Prefix-Free Codes

For many kinds of prefix-free codes there are efficient and compact alternatives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in…

Data Structures and Algorithms · Computer Science 2021-04-02 Antonio Fariña, Travis Gagie, Szymon Grabowski, Giovanni Manzini, Gonzalo Navarro, Alberto Ordóñez

Reserved-Length Prefix Coding

Huffman coding finds an optimal prefix code for a given probability mass function. Consider situations in which one wishes to find an optimal code with the restriction that all codewords have lengths that lie in a user-specified set of…

Information Theory · Computer Science 2008-01-03 Michael B. Baer

Fast and Compact Prefix Codes

It is well-known that, given a probability distribution over $n$ characters, in the worst case it takes $\Theta(n \log n)$ bits to store a prefix code with minimum expected codeword length. However, in this paper we first show that, for…

Data Structures and Algorithms · Computer Science 2009-05-20 Travis Gagie, Gonzalo Navarro, Yakov Nekrich

Optimal Dynamic Sequence Representations

We describe a data structure that supports access, rank and select queries, as well as symbol insertions and deletions, on a string $S[1,n]$ over alphabet $[1..\sigma]$ in time $O(\lg n/\lg\lg n)$, which is optimal even on binary sequences…

Data Structures and Algorithms · Computer Science 2013-02-04 Gonzalo Navarro, Yakov Nekrich

Space-Efficient Huffman Codes Revisited

Canonical Huffman code is an optimal prefix-free compression code whose codewords enumerated in the lexicographical order form a list of binary words in non-decreasing lengths. Gagie et al. (2015) gave a representation of this coding…

Data Structures and Algorithms · Computer Science 2021-08-19 Szymon Grabowski, Dominik Köppl

Optimal Prefix Free Code in Linear Time

We describe an algorithm computing an optimal prefix free code from $N$ unsorted positive integer weights in time linear in the number of machine words holding those weights. This algorithm takes advantage of common non-algebraic…

Data Structures and Algorithms · Computer Science 2017-03-02 Jérémy Barbay

Finding Short Synchronizing Words for Prefix Codes

We study the problems of finding a shortest synchronizing word and its length for a given prefix code. This is done in two different settings: when the code is defined by an arbitrary decoder recognizing its star and when the code is…

Formal Languages and Automata Theory · Computer Science 2018-06-19 Andrew Ryzhikov, Marek Szykuła

Optimal Prefix Free Codes With Partial Sorting

We describe an algorithm computing an optimal prefix free code for $n$ unsorted positive weights in time within $O(n(1+\lg \alpha))\subseteq O(n\lg n)$, where the alternation $\alpha\in[1..n-1]$ measures the amount of sorting required by…

Data Structures and Algorithms · Computer Science 2016-02-02 Jérémy Barbay

A Textbook Solution for Dynamic Strings

We consider the problem of maintaining a collection of strings while efficiently supporting splits and concatenations on them, as well as comparing two substrings, and computing the longest common prefix between two suffixes. This problem…

Data Structures and Algorithms · Computer Science 2024-08-15 Zsuzsanna Lipták, Francesco Masillo, Gonzalo Navarro

Optimal Prefix Codes with Fewer Distinct Codeword Lengths are Faster to Construct

A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the…

Data Structures and Algorithms · Computer Science 2016-09-30 Ahmed Belal, Amr Elmasry

Weighted Adaptive Coding

Huffman coding is known to be optimal, yet its dynamic version may be even more efficient in practice. A new variant of Huffman encoding has been proposed recently, that provably always performs better than static Huffman coding by at least…

Data Structures and Algorithms · Computer Science 2020-05-19 Aharon Fruchtman, Yoav Gross, Shmuel T. Klein, Dana Shapira

Fast Compressed Self-Indexes with Deterministic Linear-Time Construction

We introduce a compressed suffix array representation that, on a text $T$ of length $n$ over an alphabet of size $\sigma$, can be built in $O(n)$ deterministic time, within $O(n\log\sigma)$ bits of working space, and counts the number of…

Data Structures and Algorithms · Computer Science 2017-09-05 J. Ian Munro, Gonzalo Navarro, Yakov Nekrich

$D$-ary Bounded-Length Huffman Coding

Efficient optimal prefix coding has long been accomplished via the Huffman algorithm. However, there is still room for improvement and exploration regarding variants of the Huffman problem. Length-limited Huffman coding, useful for many…

Information Theory · Computer Science 2007-07-13 Michael B. Baer

Dynamic Shannon Coding

We present a new algorithm for dynamic prefix-free coding, based on Shannon coding. We give a simple analysis and prove a better upper bound on the length of the encoding produced than the corresponding bound for dynamic Huffman coding. We…

Information Theory · Computer Science 2007-07-16 Travis Gagie

New Algorithms and Lower Bounds for Sequential-Access Data Compression

This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…

Information Theory · Computer Science 2009-02-03 Travis Gagie

Twenty (or so) Questions: $D$-ary Length-Bounded Prefix Coding

Efficient optimal prefix coding has long been accomplished via the Huffman algorithm. However, there is still room for improvement and exploration regarding variants of the Huffman problem. Length-limited Huffman coding, useful for many…

Information Theory · Computer Science 2007-07-13 Michael B. Baer

Longest Common Prefixes with $k$-Errors and Applications

Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. In this paper, we…

Data Structures and Algorithms · Computer Science 2018-01-16 Lorraine A. K. Ayad, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis