Related papers: Fast Codes for Large Alphabets
A novel fast recursive coding technique is proposed. It operates with only integer values not longer 8 bits and is multiplication free. Recursion the algorithm is based on indirectly provides rather effective coding of symbols for very…
The order of letters is not always relevant in a communication task. This paper discusses the implications of order irrelevance on source coding, presenting results in several major branches of source coding theory: lossless coding,…
Large alphabet source coding is a basic and well-studied problem in data compression. It has many applications such as compression of natural language text, speech and images. The classic perception of most commonly used methods is that a…
We address the recently suggested problem of causal lossless coding of a randomly arriving source samples. We construct variable-to-fixed coding schemes and show that they outperform the previously considered fixed-to-variable schemes when…
Motivated from the fact that universal source coding on countably infinite alphabets is not feasible, this work introduces the notion of almost lossless source coding. Analog to the weak variable-length source coding problem studied by Han…
There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size. While there is no known polynomial time algorithm for solving it optimally there are…
Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings. We show how to convert a compression…
Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive…
The so-called fast polar decoding schedules are meant to improve the decoding speed of the sequential-natured successive cancellation list decoders. The decoding speedup is achieved by replacing various parts of the serial decoding process…
This paper presents lossless prefix codes optimized with respect to a pay-off criterion consisting of a convex combination of maximum codeword length and average codeword length. The optimal codeword lengths obtained are based on a new…
We study the new problem of Huffman-like codes subject to individual restrictions on the code-word lengths of a subset of the source words. These are prefix codes with minimal expected code-word length for a random source where additionally…
This short paper describes a simple coding technique, Sparse Sequential Dirichlet Coding, for multi-alphabet memoryless sources. It is appropriate in situations where only a small, unknown subset of the possible alphabet symbols can be…
This paper presents prefix codes which minimize various criteria constructed as a convex combination of maximum codeword length and average codeword length or maximum redundancy and average redundancy, including a convex combination of the…
Universal fixed-to-variable lossless source coding for memoryless sources is studied in the finite blocklength and higher-order asymptotics regimes. Optimal third-order coding rates are derived for general fixed-to-variable codes and for…
We show how universal codes can be used for solving some of the most important statistical problems for time series. By definition, a universal code (or a universal lossless data compressor) can compress any sequence generated by a…
In this paper, we generalize the well-known index coding problem to exploit the structure in the source-data to improve system throughput. In many applications, the data to be transmitted may lie (or can be well approximated) in a…
Let $P = \{p(i)\}$ be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial $P$ for which known methods find a…
This paper presents a new approach for code similarity on High Level programs. Our technique is based on Fast Dynamic Time Warping, that builds a warp path or points relation with local restrictions. The source code is represented into Time…
For the discrete memoryless sources with a countably infinite alphabet, we prove that for any positive integer $k$, there exists a corresponding probability interval such that if the largest symbol probability $p_{1}$ falls in this…
Adaptive coding faces the following problem: given a collection of source classes such that each class in the collection has non-trivial minimax redundancy rate, can we design a single code which is asymptotically minimax over each class in…