Related papers: Pushdown Compression

Polylog space compression, pushdown compression, and Lempel-Ziv are incomparable

The pressing need for efficient compression schemes for XML documents has recently been focused on stack computation, and in particular calls for a formulation of information-lossless stack or pushdown compressors that allows a formal…

Computational Complexity · Computer Science 2009-03-25 Elvira Mayordomo, Philippe Moser, Sylvain Perifel

Pushdown and Lempel-Ziv Depth

This paper expands upon existing and introduces new formulations of Bennett's logical depth. In previously published work by Jordon and Moser, notions of finite-state depth and pushdown depth were examined and compared. These were based on…

Computational Complexity · Computer Science 2022-01-20 Liam Jordon, Philippe Moser

Compression with the tudocomp Framework

We present a framework facilitating the implementation and comparison of text compression algorithms. We evaluate its features by a case study on two novel compression algorithms based on the Lempel-Ziv compression schemes that perform well…

Data Structures and Algorithms · Computer Science 2021-04-23 Patrick Dinklage, Johannes Fischer, Dominik Köppl, Marvin Löbel, Kunihiko Sadakane

Bit-Optimal Lempel-Ziv compression

One of the most famous and investigated lossless data-compression schemes is the one introduced by Lempel and Ziv about 40 years ago. This compression scheme is known as "dictionary-based compression" and consists of squeezing an input…

Data Structures and Algorithms · Computer Science 2008-02-07 Paolo Ferragina, Igor Nitto, Rossano Venturini

Bounded Pushdown dimension vs Lempel Ziv information density

In this paper we introduce a variant of pushdown dimension called bounded pushdown (BPD) dimension, that measures the density of information contained in a sequence, relative to a BPD automaton, i.e. a finite state machine equipped with an…

Computational Complexity · Computer Science 2007-07-13 Pilar Albert, Elvira Mayordomo, Philippe Moser

LZD-style Compression Scheme with Truncation and Repetitions

Lempel-Ziv-Double (LZD) is a variation of the LZ78 compression scheme that achieves better compression on repetitive datasets. Nevertheless, prior research has identified computational inefficiencies and a weakness in its compressibility…

Data Structures and Algorithms · Computer Science 2025-05-05 Linus Götz, Dominik Köppl

LZ-Compressed String Dictionaries

We show how to compress string dictionaries using the Lempel-Ziv (LZ78) data compression algorithm. Our approach is validated experimentally on dictionaries of up to 1.5 GB of uncompressed text. We achieve compression ratios often…

Data Structures and Algorithms · Computer Science 2013-05-06 Julian Arz, Johannes Fischer

Lempel-Ziv-like Parsing in Small Space

Lempel-Ziv (LZ77 or, briefly, LZ) is one of the most effective and widely-used compressors for repetitive texts. However, the existing efficient methods computing the exact LZ parsing have to use linear or close to linear space to index the…

Data Structures and Algorithms · Computer Science 2020-05-12 Dmitry Kosolobov, Daniel Valenzuela, Gonzalo Navarro, Simon J. Puglisi

DZip: improved general-purpose lossless compression based on novel neural network modeling

We consider lossless compression based on statistical data modeling followed by prediction-based encoding, where an accurate statistical model for the input data leads to substantial improvements in compression. We propose DZip, a…

Machine Learning · Computer Science 2020-09-21 Mohit Goyal, Kedar Tatwawadi, Shubham Chandak, Idoia Ochoa

A High-Throughput Hardware Accelerator for Lempel-Ziv 4 Compression Algorithm

This paper delves into recent hardware implementations of the Lempel-Ziv 4 (LZ4) algorithm, highlighting two key factors that limit the throughput of single-kernel compressors. Firstly, the actual parallelism exhibited in single-kernel…

Hardware Architecture · Computer Science 2024-09-20 Tao Chen, Suwen Song, Zhongfeng Wang

Optimal Universal Lossless Compression with Side Information

This paper presents conditional versions of Lempel-Ziv (LZ) algorithm for settings where compressor and decompressor have access to the same side information. We propose a fixed-length-parsing LZ algorithm with side information, motivated…

Information Theory · Computer Science 2017-07-19 Yeohee Im, Sergio Verdú

A Family of LZ78-based Universal Sequential Probability Assignments

We propose and study a family of universal sequential probability assignments on individual sequences, based on the incremental parsing procedure of the Lempel-Ziv (LZ78) compression algorithm. We show that the normalized log loss under any…

Information Theory · Computer Science 2025-12-15 Naomi Sagan, Tsachy Weissman

Universal Indexes for Highly Repetitive Document Collections

Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that…

Information Retrieval · Computer Science 2016-05-25 Francisco Claude, Antonio Fariña, Miguel A. Martínez-Prieto, Gonzalo Navarro

Lossy Compression in Near-Linear Time via Efficient Random Codebooks and Databases

The compression-complexity trade-off of lossy compression algorithms that are based on a random codebook or a random database is examined. Motivated, in part, by recent results of Gupta-Verdú-Weissman (GVW) and their underlying…

Information Theory · Computer Science 2009-04-23 Chris Gioran, Ioannis Kontoyiannis

Sublinear Algorithms for Approximating String Compressibility

We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE)…

Data Structures and Algorithms · Computer Science 2007-06-11 Sofya Raskhodnikova, Dana Ron, Ronitt Rubinfeld, Adam Smith

A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support

We present a simple adaptation of the Lempel-Ziv 78 (LZ78) compression scheme (IEEE Transactions on Information Theory, 1978) that supports efficient random access to the input string. Namely, given query access to the compressed…

Data Structures and Algorithms · Computer Science 2013-01-14 Akashnil Dutta, Reut Levi, Dana Ron, Ronitt Rubinfeld

Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections

Compression techniques that support fast random access are a core component of any information system. Current state-of-the-art methods group documents into fixed-sized blocks and compress each block with a general-purpose adaptive…

Data Structures and Algorithms · Computer Science 2015-03-19 Christopher Hoobin, Simon J. Puglisi, Justin Zobel

LZ78 Substring Compression in Compressed Space

The Lempel-Ziv 78 (LZ78) factorization is a well-studied technique for data compression. It and its derivatives are used in compression formats such as "compress" or "gif". Although most research focuses on the factorization of plain data,…

Data Structures and Algorithms · Computer Science 2025-12-22 Hiroki Shibata, Dominik Köppl

Self-Index based on LZ77 (thesis)

Domains like bioinformatics, version control systems, collaborative editing systems (wiki), and others, are producing huge data collections that are very repetitive. That is, there are few differences between the elements of the collection.…

Data Structures and Algorithms · Computer Science 2011-12-21 Sebastian Kreft, Gonzalo Navarro

Time and Memory Efficient Lempel-Ziv Compression Using Suffix Arrays

The well-known dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are the basis of several universal lossless compression techniques. These algorithms are asymmetric regarding encoding/decoding time and memory requirements, with…

Data Structures and Algorithms · Computer Science 2009-12-31 Artur Ferreira, Arlindo Oliveira, Mario Figueiredo