Related papers: Grammar Compressed Sequences with Rank/Select Supp…

200 papers

We present a new graph compressor that works by recursively detecting repeated substructures and representing them through grammar rules. We show that for a large number of graphs the compressor obtains smaller representations than other…

Data Structures and Algorithms · Computer Science 2017-04-19 Sebastian Maneth, Fabian Peternek

Sequence representations supporting queries $access$, $select$ and $rank$ are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how…

Data Structures and Algorithms · Computer Science 2013-08-26 Djamal Belazzougui, Gonzalo Navarro

Grammar-based compression is a popular and powerful approach to compressing repetitive texts but until recently its relatively poor time-space trade-offs during real-life construction made it impractical for truly massive datasets such as…

Data Structures and Algorithms · Computer Science 2020-07-21 Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Louisa Seelbach Benkner, Yoshimasa Takabatake

We introduce the first grammar-compressed representation of a sequence that supports searches in time that depends only logarithmically on the size of the grammar. Given a text $T[1..u]$ that is represented by a (context-free) grammar of…

Data Structures and Algorithms · Computer Science 2011-10-21 Francisco Claude, Gonzalo Navarro

Rank and select queries on bitmaps are essential building bricks of many compressed data structures, including text indexes, membership and range supporting spatial data structures, compressed graphs, and more. Theoretically considered yet…

Data Structures and Algorithms · Computer Science 2016-05-13 Szymon Grabowski, Marcin Raniszewski

Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the…

Data Structures and Algorithms · Computer Science 2014-08-15 Djamal Belazzougui, Simon J. Puglisi, Yasuo Tabei

Grammar based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel…

Data Structures and Algorithms · Computer Science 2013-10-30 Philip Bille, Gad M. Landau, Rajeev Raman, Kunihiko Sadakane, Srinivasa Rao Satti, Oren Weimann

Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory…

Computation and Language · Computer Science 2022-01-11 Justin T. Chiu, Yuntian Deng, Alexander M. Rush

Large-alphabet strings are common in scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in…

Data Structures and Algorithms · Computer Science 2024-05-03 Diego Arroyuelo, Gabriel Carmona, Héctor Larrañaga, Francisco Riveros, Carlos Eugenio Rojas-Morales, Erick Sepúlveda

The problem of storing a set of strings --- a string dictionary --- in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for natural language processing…

Data Structures and Algorithms · Computer Science 2011-01-31 Nieves R. Brisaboa, Rodrigo Cánovas, Miguel A. Martínez-Prieto, Gonzalo Navarro

A grammar-compressed ranked tree is represented with a linear space overhead so that a single traversal step, i.e., the move to the parent or the i-th child, can be carried out in constant time. Moreover, we extend our data structure such…

Data Structures and Algorithms · Computer Science 2015-11-11 Markus Lohrey, Sebastian Maneth, Carl Philipp Reh

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen, Mikko Berggren Ettienne

Neural networks using numerous text data have been successfully applied to a variety of tasks. While massive text data is usually compressed using techniques such as grammar compression, almost all of the previous machine learning methods…

Machine Learning · Statistics 2020-03-02 Yoichi Sasaki, Kosuke Akimoto, Takanori Maehara

We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix…

Data Structures and Algorithms · Computer Science 2019-11-20 Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro

Two decades ago, a breakthrough in indexing string collections made it possible to represent them within their compressed space while at the same time offering indexed search functionalities. As this new technology permeated through…

Data Structures and Algorithms · Computer Science 2022-11-28 Gonzalo Navarro

We present an algorithm for searching regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data…

Formal Languages and Automata Theory · Computer Science 2019-01-17 Pierre Ganty, Pedro Valero

This paper describes substantial advances in the analysis (parsing) of diagrams using constraint grammars. The addition of set types to the grammar and spatial indexing of the data make it possible to efficiently parse real diagrams of…

cmp-lg · Computer Science 2008-02-03 Robert P. Futrelle, Nikos Nikolakis

Pattern matching is the most central task for text indices. Most recent indices leverage compression techniques to make pattern matching feasible for massive but highly-compressible datasets. Within this kind of indices, we propose a new…

Data Structures and Algorithms · Computer Science 2021-05-31 Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

We introduce a data structure for counting pattern occurrences in texts compressed with any run-length context-free grammar. Our structure uses space proportional to the grammar size and counts the occurrences of a pattern of length $m$ in…

Data Structures and Algorithms · Computer Science 2025-01-30 Gonzalo Navarro, Alejandro Pacheco

Sentences are important semantic units of natural language. A generic, distributional representation of sentences that can capture the latent semantics is beneficial to multiple downstream applications. We observe a simple geometry of…

Computation and Language · Computer Science 2017-04-19 Jiaqi Mu, Suma Bhat, Pramod Viswanath