Related papers: About adaptive coding on countable alphabets
Adaptive coding faces the following problem: given a collection of source classes such that each class in the collection has non-trivial minimax redundancy rate, can we design a single code which is asymptotically minimax over each class in…
This paper describes universal lossless coding strategies for compressing sources on countably infinite alphabets. Classes of memoryless sources defined by an envelope condition on the marginal distribution provide benchmarks for coding…
In this paper, we study the problem of lossless universal source coding for stationary memoryless sources on countably infinite alphabets. This task is generally not achievable without restricting the class of sources over which…
This paper deals with the problem of universal lossless coding on a countable infinite alphabet. It focuses on some classes of sources defined by an envelope condition on the marginal distribution, namely exponentially decreasing envelope…
Motivated from the fact that universal source coding on countably infinite alphabets is not feasible, this work introduces the notion of almost lossless source coding. Analog to the weak variable-length source coding problem studied by Han…
Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive…
We show how universal codes can be used for solving some of the most important statistical problems for time series. By definition, a universal code (or a universal lossless data compressor) can compress any sequence generated by a…
The Poisson-sampling technique eliminates dependencies among symbol appearances in a random sequence. It has been used to simplify the analysis and strengthen the performance guarantees of randomized algorithms. Applying this method to…
The minimum average number of bits need to describe a random variable is its entropy, assuming knowledge of the underlying statistics On the other hand, universal compression supposes that the distribution of the random variable, while…
This paper describes design of a low-complexity algorithm for adaptive encoding/ decoding of binary sequences produced by memoryless sources. The algorithm implements universal block codes constructed for a set of contexts identified by the…
We study universal compression of sequences generated by monotonic distributions. We show that for a monotonic distribution over an alphabet of size $k$, each probability parameter costs essentially $0.5 \log (n/k^3)$ bits, where $n$ is the…
We consider universal variable-to-fixed length compression of memoryless sources with a fidelity criterion. We design a dictionary codebook over the reproduction alphabet which is used to parse the source stream. Once a source subsequence…
We prove the existence of codebooks for d-semifaithful lossy compression that are simultaneously universal with respect to both the class of finite-alphabet memoryless sources and the class of all bounded additive distortion measures. By…
As it is known, universal codes, which estimate the entropy rate consistently, exist for stationary ergodic sources over finite alphabets but not over countably infinite ones. We generalize universal coding as the problem of universal…
In this paper we consider the class of anti-uniform Huffman codes and derive tight lower and upper bounds on the average length, entropy, and redundancy of such codes in terms of the alphabet size of the source. The Fibonacci distributions…
Universal fixed-to-variable lossless source coding for memoryless sources is studied in the finite blocklength and higher-order asymptotics regimes. Optimal third-order coding rates are derived for general fixed-to-variable codes and for…
This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…
Transformer architectures have achieved state-of-the-art results on a variety of sequence modeling tasks. However, their attention mechanism comes with a quadratic complexity in sequence lengths, making the computational overhead…
We present new lower and upper bounds for the compression rate of binary prefix codes optimized over memoryless sources according to two related exponential codeword length objectives. The objectives explored here are exponential-average…
In this paper, the context dependence multilevel pattern matching(in short CDMPM) grammar transform is proposed; based on this grammar transform, the universal lossless data compression algorithm, CDMPM code is then developed. Moreover we…