Related papers: Efficient Compressed Wavelet Trees over Large Alph…
The wavelet tree (Grossi et al. [SODA, 2003]) and wavelet matrix (Claude et al. [Inf. Syst., 47:15--32, 2015]) are compact indices for texts over an alphabet $[0,\sigma)$ that support rank, select and access queries in $O(\lg \sigma)$ time.…
Wavelet trees are widely used in the representation of sequences, permutations, text collections, binary relations, discrete points, and other succinct data structures. We show, however, that this still falls short of exploiting all of the…
An indexed sequence of strings is a data structure for storing a string sequence that supports random access, searching, range counting and analytics operations, both for exact matches and prefix search. String sequences lie at the core of…
The wavelet tree (Grossi et al. [SODA, 2003]) and wavelet matrix (Claude et al. [Inf. Syst., 2015]) are compact data structures with many applications such as text indexing or computational geometry. By continuing the recent research of…
In image compression, the aim is to reduce the number of bits required to represent an image by removing spatial and spectral redundancies. Recently, the discrete wavelet transform and wavelet packets have emerged as popular…
A wavelet forest for a text $T [1..n]$ over an alphabet of size $\sigma$ takes $n H_0 (T) + o (n \log \sigma)$ bits of space and supports access and rank on $T$ in $O (\log \sigma)$ time. Kärkkäinen and Puglisi (2011) implicitly introduced…
The wavelet tree has become a very useful data structure to efficiently represent and query large volumes of data in many different domains, from bioinformatics to geographic information systems. One problem with wavelet trees is their…
Rank and select queries are basic operations on sequences, with applications in compressed text indexes and other space-efficient data structures. One of the standard data structures supporting these queries is the wavelet tree. In this…
We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings. Given a string of length n over an alphabet of size $\sigma\leq n$, our method builds…
We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…
Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into…
Can we do arithmetic in a completely different way, with a radically different data structure? Could this approach provide practical benefits, like operations on giant numbers while having an average performance similar to traditional…
Wavelets are well known for data compression, yet have rarely been applied to the compression of neural networks. This paper shows how the fast wavelet transform can be used to compress linear layers in neural networks. Linear layers still…
Suffix trees are a fundamental data structure in stringology, but their space usage, though linear, is an important problem for their applications. We design and implement a new compressed suffix tree targeted to highly repetitive texts, such…
Large-alphabet strings are common in scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in…
In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often…
We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..\sigma]$ in $n H_0(s) + o(n)(H_0(s)+1)$ bits, where $H_0(s)$ is the zero-order entropy of $s$. This structure supports the queries access, rank and select,…
Compressed file formats are the cornerstone of efficient data storage and transmission, yet their potential for representation learning remains largely underexplored. We introduce TEMPEST (TransformErs froM comPressed rEpreSenTations), a…
The constant center frequency to bandwidth ratio (Q-factor) of wavelet transforms provides a very natural representation for audio data. However, invertible wavelet transforms have either required non-uniform decimation -- leading to…
We consider the problem of storing a dynamic string $S$ over an alphabet $\Sigma=\{\,1,\ldots,\sigma\,\}$ in compressed form. Our representation supports insertions and deletions of symbols and answers three fundamental queries:…
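Several of the abstracts above refer to access, rank and select queries on a wavelet tree over an alphabet $[0,\sigma)$. The following is a minimal illustrative sketch of those three queries, not the construction from any of the listed papers. It assumes integer symbols and stores plain prefix-count lists in place of succinct bitvectors, so it uses far more space than the compact structures described. The class name `WaveletTree` and its method names are hypothetical, chosen only for this example.

```python
class WaveletTree:
    """Pointer-based wavelet tree over integers in [lo, hi).

    Assumption for illustration: each node keeps a prefix-count list
    instead of a succinct bitvector with constant-time rank/select.
    """

    def __init__(self, seq, lo=None, hi=None):
        seq = list(seq)
        if lo is None:
            lo, hi = 0, (max(seq) + 1 if seq else 1)
        self.lo, self.hi, self.n = lo, hi, len(seq)
        self.zeros = self.left = self.right = None
        if hi - lo <= 1 or not seq:
            return  # leaf, or empty subtree that is never descended into
        mid = (lo + hi) // 2
        # zeros[i] = how many of the first i symbols are < mid (go left)
        self.zeros = [0]
        for c in seq:
            self.zeros.append(self.zeros[-1] + (c < mid))
        self.left = WaveletTree([c for c in seq if c < mid], lo, mid)
        self.right = WaveletTree([c for c in seq if c >= mid], mid, hi)

    def access(self, i):
        """Return the symbol at position i (0-based)."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        node = self
        while node.hi - node.lo > 1:
            z = node.zeros[i]
            if node.zeros[i + 1] - z:   # symbol at i went left
                node, i = node.left, z
            else:                       # symbol at i went right
                node, i = node.right, i - z
        return node.lo

    def rank(self, c, i):
        """Number of occurrences of c among the first i symbols."""
        if not self.lo <= c < self.hi:
            return 0
        node, i = self, max(0, min(i, self.n))
        while node.hi - node.lo > 1 and i > 0:
            z = node.zeros[i]
            if c < (node.lo + node.hi) // 2:
                node, i = node.left, z
            else:
                node, i = node.right, i - z
        return i

    def select(self, c, k):
        """0-based position of the k-th (k >= 1) occurrence of c, or -1."""
        if not self.lo <= c < self.hi or k < 1:
            return -1
        return self._select(c, k)

    def _select(self, c, k):
        if self.hi - self.lo <= 1:
            return k - 1 if k <= self.n else -1
        if self.n == 0:
            return -1
        go_left = c < (self.lo + self.hi) // 2
        j = (self.left if go_left else self.right)._select(c, k)
        if j < 0:
            return -1
        # Map child position j back to this node: binary-search the
        # smallest prefix length p whose left/right count reaches j + 1.
        a, b = 1, self.n
        while a < b:
            m = (a + b) // 2
            count = self.zeros[m] if go_left else m - self.zeros[m]
            if count >= j + 1:
                b = m
            else:
                a = m + 1
        return a - 1


seq = [3, 1, 4, 1, 5, 2, 6, 5, 3, 5]
wt = WaveletTree(seq)
print(wt.access(5), wt.rank(1, 4), wt.select(5, 2))
```

In this sketch access and rank visit one node per level, i.e. $O(\lg \sigma)$ steps, while select costs an extra binary search per level because the prefix-count lists offer no constant-time select; a succinct bitvector would remove that factor.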