Related papers: Redundancy Estimates for Word-Based Encoding of Se…
The penalty incurred by imposing a finite delay constraint in lossless source coding of a memoryless source is investigated. It is well known that for the so-called block-to-variable and variable-to-variable codes, the redundancy decays at…
This paper presents new lower and upper bounds for the compression rate of binary prefix codes optimized over memoryless sources according to various nonlinear codeword length objectives. Like the most well-known redundancy bounds for…
Determining the largest size, or equivalently finding the lowest redundancy, of q-ary codes for given length and minimum distance is one of the central and fundamental problems in coding theory. Inspired by the construction of…
Consider a binary word being transmitted through a communication channel that introduces deletable errors where each bit of the word is either retained, flipped, erased or deleted. The simplest code for correcting all possible…
We study the effects of finite-precision representation of a source's probabilities on the efficiency of classic source coding algorithms, such as Shannon, Gilbert-Moore, or arithmetic codes. In particular, we establish the following simple…
We study universal compression of sequences generated by monotonic distributions. We show that for a monotonic distribution over an alphabet of size $k$, each probability parameter costs essentially $0.5 \log (n/k^3)$ bits, where $n$ is the…
The minimum average number of bits needed to describe a random variable is its entropy, assuming knowledge of the underlying statistics. On the other hand, universal compression supposes that the distribution of the random variable, while…
We present new lower and upper bounds for the compression rate of binary prefix codes optimized over memoryless sources according to two related exponential codeword length objectives. The objectives explored here are exponential-average…
We generalize the notion of the stopping redundancy in order to study the smallest size of a trapping set in Tanner graphs of linear block codes. In this context, we introduce the notion of the trapping redundancy of a code, which…
Redundancy of experimental data is the basic statistic from which the complexity of a natural phenomenon and the proper number of experiments needed for its exploration can be estimated. The redundancy is expressed by the entropy of…
In this paper, we study the redundancy of linear codes with graph constraints. First we consider linear parity check codes based on bipartite graphs with diversity and with generalized graph constraints. We describe sufficient conditions on…
This paper presents new lower and upper bounds for the optimal compression of binary prefix codes in terms of the most probable input symbol, where compression efficiency is determined by the nonlinear codeword length objective of…
For a given stable recurrent neural network (RNN) that is trained to perform a classification task using sequential inputs, we quantify explicit robustness bounds as a function of trainable weight matrices. The sequential inputs can be…
The performance of an error correcting code is evaluated by its error probability, rate, and en/decoding complexity. The performance of a series of codes is evaluated by, as the block lengths approach infinity, whether their error…
Code-trained language models have proven to be highly effective for various code intelligence tasks. However, they can be challenging to train and deploy for many software engineering applications due to computational bottlenecks and memory…
Consider the set of source distributions within a fixed maximum relative entropy with respect to a given nominal distribution. Lossless source coding over this relative entropy ball can be approached in more than one way. A problem…
We study and propose schemes that map messages onto constant-weight codewords using variable-length prefixes. We provide polynomial-time computable formulas that estimate the average number of redundant bits incurred by our schemes. In…
A classic result in algorithmic information theory is that every infinite binary sequence is computable from a Martin-Löf random infinite binary sequence. Proved independently by Kučera and Gács, this result answered a question by Charles…
A variable-length code is a fix-free code if no codeword is a prefix or a suffix of any other codeword. In a fix-free code any finite sequence of codewords can be decoded in both directions, which can improve the robustness to channel noise…
Batch codes are a useful notion of locality for error correcting codes, originally introduced in the context of distributed storage and cryptography. Many constructions of batch codes have been given, but few lower bound (limitation)…