Related papers: Compressing combinatorial objects

Integer Set Compression and Statistical Modeling

Compression of integer sets and sequences has been extensively studied for settings where elements follow a uniform probability distribution. In addition, methods exist that exploit clustering of elements in order to achieve higher…

Information Theory · Computer Science 2014-02-11 N. Jesper Larsson

An Introduction to Neural Data Compression

Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression…

Machine Learning · Computer Science 2023-08-22 Yibo Yang, Stephan Mandt, Lucas Theis

Data Compression with Prime Numbers

A compression algorithm is presented that uses the set of prime numbers. Sequences of numbers are correlated with the prime numbers, and labeled with the integers. The algorithm can be iterated on data sets, generating factors of doubles on…

General Physics · Physics 2007-05-23 Gordon Chalmers

Applying Data Compression Techniques on Systolic Neural Network Accelerator

New directions in computing and algorithms have led to some new applications that have tolerance to imprecision. However, these applications are creating large volumes of data which exceed the capability of today's computing systems.…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-01-16 Navid Mirnouri

Random Permutation Codes: Lossless Source Coding of Non-Sequential Data

This thesis deals with the problem of communicating and storing non-sequential data. We investigate this problem through the lens of lossless source coding, also sometimes referred to as lossless compression, from both an algorithmic and…

Information Theory · Computer Science 2024-11-25 Daniel Severo

Compressing Sets and Multisets of Sequences

This article describes lossless compression algorithms for multisets of sequences, taking advantage of the multiset's unordered structure. Multisets are a generalisation of sets where members are allowed to occur multiple times. A multiset…

Information Theory · Computer Science 2014-01-27 Christian Steinruecken

Sequential Universal Modeling for Non-Binary Sequences with Constrained Distributions

Sequential probability assignment and universal compression go hand in hand. We propose sequential probability assignment for non-binary (and large alphabet) sequences with empirical distributions whose parameters are known to be bounded…

Information Theory · Computer Science 2021-02-09 Michael Drmota, Gil Shamir, Wojciech Szpankowski

Compression-based methods for nonparametric density estimation, on-line prediction, regression and classification for time series

We address the problem of nonparametric estimation of characteristics for stationary and ergodic time series. We consider finite-alphabet time series and real-valued ones and the following four problems: i) estimation of the (limiting)…

Information Theory · Computer Science 2007-11-01 Boris Ryabko

Critical Data Compression

A new approach to data compression is developed and applied to multimedia content. This method separates messages into components suitable for both lossless coding and 'lossy' or statistical coding techniques, compressing complex objects by…

Information Theory · Computer Science 2011-12-26 John Scoville

A Compression Algorithm Using Mis-aligned Side-information

We study the problem of compressing a source sequence in the presence of side-information that is related to the source via insertions, deletions and substitutions. We propose a simple algorithm to compress the source sequence when the…

Information Theory · Computer Science 2016-11-15 Nan Ma, Kannan Ramchandran, David Tse

Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

Compressed sensing is a signal processing method that acquires data directly in a compressed form. This allows one to make fewer measurements than what was considered necessary to record a signal, enabling faster or more precise measurement…

Statistical Mechanics · Physics 2012-08-20 Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

The Minimal Compression Rate for Similarity Identification

Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case…

Information Theory · Computer Science 2013-12-10 Amir Ingber, Tsachy Weissman

Near Lossless Time Series Data Compression Methods using Statistics and Deviation

The last two decades have seen tremendous growth in data collections because of the realization of recent technologies, including the internet of things (IoT), E-Health, industrial IoT 4.0, autonomous vehicles, etc. The challenge of data…

Information Theory · Computer Science 2022-10-03 Vidhi Agrawal, Gajraj Kuldeep, Dhananjoy Dey

Compressing Multisets with Large Alphabets using Bits-Back Coding

Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings. We show how to convert a compression…

Information Theory · Computer Science 2023-02-28 Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani, Karen Ullrich

Entropy Coding of Unordered Data Structures

We present shuffle coding, a general method for optimal compression of sequences of unordered objects using bits-back coding. Data structures that can be compressed using shuffle coding include multisets, graphs, hypergraphs, and others. We…

Machine Learning · Computer Science 2024-08-19 Julius Kunze, Daniel Severo, Giulio Zani, Jan-Willem van de Meent, James Townsend

DeepZip: Lossless Data Compression using Recurrent Neural Networks

Sequential data is being generated at an unprecedented pace in various forms, including text and genomic data. This creates the need for efficient compression mechanisms to enable better storage, transmission and processing of such data. To…

Computation and Language · Computer Science 2018-11-21 Mohit Goyal, Kedar Tatwawadi, Shubham Chandak, Idoia Ochoa

Efficient Compression of Long Arbitrary Sequences with No Reference at the Encoder

In a distributed information application an encoder compresses an arbitrary vector while a similar reference vector is available to the decoder as side information. For the Hamming-distance similarity measure, and when guaranteed perfect…

Information Theory · Computer Science 2020-09-08 Yuval Cassuto, Jacob Ziv

Compression Algorithm Based on Irregular Sequence

The paper introduces a new lossless, highly robust compression algorithm that is similar to the LZW algorithm, yet the algorithm discards dictionary processing and uses irregular sequences with massive, random information instead. Then the paper…

Signal Processing · Electrical Eng. & Systems 2020-06-24 Rui Zhu

Time-universal data compression and prediction

Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data-compressors. It seems natural to try all the compressors and then choose the best, i.e. the one that gives the shortest…

Information Theory · Computer Science 2018-09-11 Boris Ryabko

Compressive Mining: Fast and Optimal Data Mining in the Compressed Domain

Real-world data typically contain repeated and periodic patterns. This suggests that they can be effectively represented and compressed using only a few coefficients of an appropriate basis (e.g., Fourier, Wavelets, etc.). However, distance…

Machine Learning · Statistics 2014-05-26 Michail Vlachos, Nikolaos Freris, Anastasios Kyrillidis