Related papers: Enhancing Dictionary Based Preprocessing For Bette…
A weight based dynamic compression method has recently been proposed, which is especially suitable for the encoding of files with locally skewed distributions. Its main idea is to assign larger weights to closer to be encoded symbols by…
In 1994, Burrows and Wheeler developed a data compression algorithm which performs significantly better than Lempel-Ziv based algorithms. Since then, a lot of work has been done in order to improve their algorithm, which is based on a…
The Burrows-Wheeler-Transform (BWT) is an invertible permutation of a text known to be highly compressible but also useful for sequence analysis, what makes the BWT highly attractive for lossless data compression. In this paper, we present…
Given a string of characters, the Burrows-Wheeler Transform rearranges the characters in it so as to produce another string of the same length which is more amenable to compression techniques such as move to front, run-length encoding, and…
Data compression is very important feature in terms of saving the memory space. In this proposal, an indexed dictionary based compression is used for text data, where the word's reference in dictionary is used for compression. This approach…
The paper introduces a new lossless, highly robust compression algorithm that similar with LZW algorithm, yet the algorithm discards dictionary processing and uses irregular sequences with massive, random information instead. Then the paper…
Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…
We propose a special-purpose class of compression algorithms for efficient compression of Prolog programs. It is a dictionary-based compression method, specially designed for the compression of Prolog code, and therefore we name it PCA…
The Burrows-Wheeler transform (BWT) is a reversible transform that converts a string $w$ into another string $\mathsf{BWT}(w)$. The size of the run-length encoded BWT (RLBWT) can be interpreted as a measure of repetitiveness in the class of…
Today there are many universal compression algorithms, but in most cases is for specific data better using specific algorithm - JPEG for images, MPEG for movies, etc. For textual documents there are special methods based on PPM algorithm or…
Motivation The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being…
Text compression shrinks textual data while keeping crucial information, eradicating constraints on storage, bandwidth, and computational efficacy. The integration of lossless compression techniques with transformer-based text decompression…
Introduced about thirty years ago in the field of Data Compression, the Burrows-Wheeler Transform (BWT) is a string transformation that, besides being a booster of the performance of memoryless compressors, plays a fundamental role in the…
The Burrows-Wheeler Transform (BWT) is a fundamental component in many data structures for text indexing and compression, widely used in areas such as bioinformatics and information retrieval. The extended BWT (eBWT) generalizes the…
We present a new semi-external algorithm that builds the Burrows--Wheeler transform variant of Bauer et al. (a.k.a., BCR BWT) in linear expected time. Our method uses compression techniques to reduce computational costs when the input is…
Machine-generated data is rapidly growing and poses challenges for data-intensive systems, especially as the growth of data outpaces the growth of storage space. To cope with the storage issue, compression plays a critical role in storage…
The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the…
In order to avoid the reference bias introduced by mapping reads to a reference genome, bioinformaticians are investigating reference-free methods for analyzing sequenced genomes. With large projects sequencing thousands of individuals,…
In this paper we propose a flexible and lightweight technique for merging compressed indices based on variants of Burrows-Wheeler transform (BWT), thus addressing the need for algorithms that compute compressed indices over large…
Many services today massively and continuously produce log files of different and varying formats. These logs are important since they contain information about the application activities, which is necessary for improvements by analyzing…