Related papers: PivotCompress: Compression by Sorting

Combinatorial Entropy Encoding

This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of…

Information Theory · Computer Science 2017-03-24 Abu Bakar Siddique

Compression of sources of probability distributions and density operators

We study the problem of efficient compression of a stochastic source of probability distributions. It can be viewed as a generalization of Shannon's source coding problem. It has relation to the theory of common randomness, as well as to…

Quantum Physics · Physics 2016-09-08 Andreas Winter

Compression in the Space of Permutations

We investigate lossy compression (source coding) of data in the form of permutations. This problem has direct applications in the storage of ordinal data or rankings, and in the analysis of sorting algorithms. We analyze the rate-distortion…

Information Theory · Computer Science 2016-11-18 Da Wang, Arya Mazumdar, Gregory Wornell

Coding sets with asymmetric information

We study the following one-way asymmetric transmission problem, also a variant of model-based compressed sensing: a resource-limited encoder has to report a small set $S$ from a universe of $N$ items to a more powerful decoder (server). The…

Data Structures and Algorithms · Computer Science 2018-07-30 Alexandr Andoni, Javad Ghaderi, Daniel Hsu, Dan Rubenstein, Omri Weinstein

Integer Set Compression and Statistical Modeling

Compression of integer sets and sequences has been extensively studied for settings where elements follow a uniform probability distribution. In addition, methods exist that exploit clustering of elements in order to achieve higher…

Information Theory · Computer Science 2014-02-11 N. Jesper Larsson

On principles of large deviation and selected data compression

The Shannon Noiseless coding theorem (the data-compression principle) asserts that for an information source with an alphabet $\mathcal X=\{0,\ldots ,\ell -1\}$ and an asymptotic equipartition property, one can reduce the number of stored…

Information Theory · Computer Science 2016-04-26 Yuri Suhov, Izabella Stuhl

Applications of Universal Source Coding to Statistical Analysis of Time Series

We show how universal codes can be used for solving some of the most important statistical problems for time series. By definition, a universal code (or a universal lossless data compressor) can compress any sequence generated by a…

Information Theory · Computer Science 2008-09-09 Boris Ryabko

Compressing combinatorial objects

Most of the world's digital data is currently encoded in a sequential form, and compression methods for sequences have been studied extensively. However, there are many types of non-sequential data for which good compression techniques are…

Information Theory · Computer Science 2016-01-15 Christian Steinruecken

Crossword: A Semantic Approach to Data Compression via Masking

The traditional methods for data compression are typically based on the symbol-level statistics, with the information source modeled as a long sequence of i.i.d. random variables or a stochastic process, thus establishing the fundamental…

Computation and Language · Computer Science 2023-04-04 Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui

Distributional convergence for the number of symbol comparisons used by QuickSort

Most previous studies of the sorting algorithm QuickSort have used the number of key comparisons as a measure of the cost of executing the algorithm. Here we suppose that the n independent and identically distributed (i.i.d.) keys are each…

Probability · Mathematics 2013-03-14 James Allen Fill

Preprocessing operations and the reverse compression

The task of compression of data -- as stated by the source coding theorem -- is one of the cornerstones of information theory. Data compression usually exploits statistical redundancies in the data according to its prior distribution.…

Quantum Physics · Physics 2021-01-08 Matheus Capela, Fabio Costa

Data Compression with Prime Numbers

A compression algorithm is presented that uses the set of prime numbers. Sequences of numbers are correlated with the prime numbers, and labeled with the integers. The algorithm can be iterated on data sets, generating factors of doubles on…

General Physics · Physics 2007-05-23 Gordon Chalmers

Time-universal data compression and prediction

Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data-compressors. It seems natural to try all the compressors and then choose the best, i.e. the one that gives the shortest…

Information Theory · Computer Science 2018-09-11 Boris Ryabko

Lossy data compression with random gates

We introduce a new protocol for a lossy data compression algorithm which is based on constraint satisfaction gates. We show that the theoretical capacity of algorithms built from standard parity-check gates converges exponentially fast to…

Disordered Systems and Neural Networks · Physics 2009-11-11 S. Ciliberti, M. Mezard, R. Zecchina

Prediction by Compression

It is well known that text compression can be achieved by predicting the next symbol in the stream of text data based on the history seen up to the current symbol. The better the prediction the more skewed the conditional probability…

Information Theory · Computer Science 2010-08-31 Joel Ratsaby

Compression and information entropy of binary strings from the collision history of three hard balls

We investigate how to measure and define the entropy of a simple chaotic system, three hard spheres on a ring. A novel approach is presented, which does not assume the ergodic hypothesis. It consists of transforming the particles collision…

Computational Physics · Physics 2023-05-08 Matej Vedak, Graeme J Ackland

Compressed Key Sort and Fast Index Reconstruction

In this paper we propose an index key compression scheme based on the notion of distinction bits by proving that the distinction bits of index keys are sufficient information to determine the sorted order of the index keys correctly. While…

Databases · Computer Science 2020-09-25 Yongsik Kwon, Cheol Ryu, Sang Kyun Cha, Arthur H. Lee, Kunsoo Park, Bongki Moon

Permutation Entropy for Signal Analysis

Shannon Entropy is the preeminent tool for measuring the level of uncertainty (and conversely, information content) in a random variable. In the field of communications, entropy can be used to express the information content of given…

Information Theory · Computer Science 2024-11-06 Bill Kay, Audun Myers, Thad Boydston, Emily Ellwein, Cameron Mackenzie, Iliana Alvarez, Erik Lentz

Cryptographic Compression

We introduce a protocol called ENCORE which simultaneously compresses and encrypts data in a one-pass process that can be implemented efficiently and possesses a number of desirable features as a streaming encoder/decoder. Motivated by the…

Cryptography and Security · Computer Science 2025-01-28 Joshua Cooper, Grant Fickes

The Sample Complexity of Lossless Data Compression

A new framework is introduced for examining and evaluating the fundamental limits of lossless data compression, that emphasizes genuinely non-asymptotic results. The sample complexity of compressing a given source is defined as the…

Information Theory · Computer Science 2026-04-16 Terence Viaud, Ioannis Kontoyiannis