Related papers: EAH: A New Encoder based on Adaptive Variable-leng…
Adaptive codes associate variable-length codewords to symbols being encoded depending on the previous symbols in the input data string. This class of codes has been introduced in [Dragos Trinca, cs.DS/0505007] as a new class of non-standard…
Adaptive codes have been introduced in [Dragos Trinca, cs.DS/0505007] as a new class of non-standard variable-length codes. These codes associate variable-length codewords to symbols being encoded depending on the previous symbols in the…
Adaptive (variable-length) codes associate variable-length codewords to symbols being encoded depending on the previous symbols in the input data string. This class of codes has been presented in [Dragos Trinca, cs.DS/0505007] as a new…
We introduce a new class of non-standard variable-length codes, called adaptive codes. This class of codes associates a variable-length codeword to the symbol being encoded depending on the previous symbols in the input data string. An…
Huffman coding is known to be optimal, yet its dynamic version may be even more efficient in practice. A new variant of Huffman encoding has been proposed recently, that provably always performs better than static Huffman coding by at least…
This paper describes a new set of block source codes well suited for data compression. These codes are defined by sets of production rules of the form a.l->b, where a in A represents a value from the source alphabet A and l, b are -small-…
In this paper we consider the class of anti-uniform Huffman codes and derive tight lower and upper bounds on the average length, entropy, and redundancy of such codes in terms of the alphabet size of the source. The Fibonacci distributions…
We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our…
This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of…
Huffman compression is a statistical, lossless data compression algorithm that compresses data by assigning variable-length codes to symbols, with more frequently appearing symbols given shorter codes than less frequent ones. This work is a…
Huffman encoding is often improved by using block codes, for example a 3-block would be an alphabet consisting of each possible combination of three characters. We take the approach of starting with a base alphabet and expanding it to…
There is a class of entropy-coding methods which do not substitute symbols by code words (such as Huffman coding), but operate on intervals or ranges. This class includes three prominent members: conventional arithmetic coding, range…
In this paper we study the adaptive prefix coding problem in cases where the size of the input alphabet is large. We present an online prefix coding algorithm that uses $O(\sigma^{1/\lambda + \epsilon})$ bits of space for any constants…
This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…
Describes a near-linear-time algorithm for a variant of Huffman coding, in which the letters may have non-uniform lengths (as in Morse code), but with the restriction that each word to be encoded has equal probability. [See also ``Huffman…
This paper presents new lower and upper bounds for the compression rate of binary prefix codes optimized over memoryless sources according to various nonlinear codeword length objectives. Like the most well-known redundancy bounds for…
We present a new data structure called the \emph{Compressed Random Access Memory} (CRAM) that can store a dynamic string $T$ of characters, e.g., representing the memory of a computer, in compressed form while achieving asymptotically…
Huffman Compression, also known as Huffman Coding, is one of many compression techniques in use today. Two important features of Huffman coding are instantaneousness, that is, the codes can be interpreted as soon as they are received, and…
Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…
Modern data compression is mainly based on two approaches to entropy coding: Huffman (HC) and arithmetic/range coding (AC). The former is much faster, but approximates probabilities with powers of 2, usually leading to relatively low…