Related papers: Investigations on Algorithm Selection for Interval…
The transmission or storage of signals typically involves data compression. The final processing step in compression systems is generally an entropy coding stage, which converts symbols into a bit stream based on their probability…
In this paper will be presented new approach to entropy coding: family of generalizations of standard numeral systems which are optimal for encoding sequence of equiprobable symbols, into asymmetric numeral systems - optimal for freely…
The modern data compression is mainly based on two approaches to entropy coding: Huffman (HC) and arithmetic/range coding (AC). The former is much faster, but approximates probabilities with powers of 2, usually leading to relatively low…
This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of…
Large alphabet source coding is a basic and well-studied problem in data compression. It has many applications such as compression of natural language text, speech and images. The classic perception of most commonly used methods is that a…
Compression is a technique to reduce the quantity of data without excessively reducing the quality of the multimedia data. The transition and storing of compressed multimedia data is much faster and more efficient than original uncompressed…
Grammar compression represents a string as a context free grammar. Achieving compression requires encoding such grammar as a binary string; there are a few commonly used encodings. We bound the size of practically used encodings for several…
In this paper, we generalize the well-known index coding problem to exploit the structure in the source-data to improve system throughput. In many applications, the data to be transmitted may lie (or can be well approximated) in a…
Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…
A novel adaptive binary decoding algorithm for LDPC codes is proposed, which reduces the decoding complexity while having a comparable or even better performance than corresponding non-adaptive alternatives. In each iteration the variable…
Huffman encoding is often improved by using block codes, for example a 3-block would be an alphabet consisting of each possible combination of three characters. We take the approach of starting with a base alphabet and expanding it to…
We study the following one-way asymmetric transmission problem, also a variant of model-based compressed sensing: a resource-limited encoder has to report a small set $S$ from a universe of $N$ items to a more powerful decoder (server). The…
This paper presents encoding and decoding algorithms for several families of optimal rank metric codes whose codes are in restricted forms of symmetric, alternating and Hermitian matrices. First, we show the evaluation encoding is the right…
Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…
Adaptive variable-length codes associate a variable-length codeword to the symbol being encoded depending on the previous symbols in the input string. This class of codes has been recently presented in [Dragos Trinca, arXiv:cs.DS/0505007]…
Non-uniquely decodable codes can be defined as the codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, where a codeword can be a prefix of other(s), and…
Many data compressors regularly encode probability distributions for entropy coding - requiring minimal description length type of optimizations. Canonical prefix/Huffman coding usually just writes lengths of bit sequences, this way…
It is common for search and optimization problems to have alternative equivalent encodings in ASP. Typically none of them is uniformly better than others when evaluated on broad classes of problem instances. We claim that one can improve…
Many efforts have been devoted to develop alternative methods to traditional vector quantization in image domain such as sparse coding and soft-assignment. These approaches can be split into a dictionary learning phase and a feature…
We consider the problem of constructing prefix-free codes in which a designated symbol, a space, can only appear at the end of codewords. We provide a linear-time algorithm to construct almost-optimal codes with this property, meaning that…