Related papers: Practical Entropy-Compressed Rank/Select Dictionar…

Rank and select: Another lesson learned

Rank and select queries on bitmaps are essential building bricks of many compressed data structures, including text indexes, membership and range supporting spatial data structures, compressed graphs, and more. Theoretically considered yet…

Data Structures and Algorithms · Computer Science 2016-05-13 Szymon Grabowski, Marcin Raniszewski

Succinct Choice Dictionaries

The choice dictionary is introduced as a data structure that can be initialized with a parameter $n\in\mathbb{N}=\{1,2,\ldots\}$ and subsequently maintains an initially empty subset $S$ of $\{1,\ldots,n\}$ under insertion, deletion,…

Data Structures and Algorithms · Computer Science 2017-03-17 Torben Hagerup, Frank Kammer

Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors

Bit vectors are fundamental building blocks of many succinct data structures. They can be used to represent graphs, are an important part of many text indices in the form of the wavelet tree, and can be used to encode ordered sequences of…

Data Structures and Algorithms · Computer Science 2022-11-08 Florian Kurpicz

Grammar Compressed Sequences with Rank/Select Support

Sequence representations supporting not only direct access to their symbols, but also rank/select operations, are a fundamental building block in many compressed data structures. Several recent applications need to represent highly…

Data Structures and Algorithms · Computer Science 2019-11-25 Alberto Ordóñez, Gonzalo Navarro, Nieves R. Brisaboa

Rank, select and access in grammar-compressed strings

Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the…

Data Structures and Algorithms · Computer Science 2014-08-15 Djamal Belazzougui, Simon J. Puglisi, Yasuo Tabei

Engineering Rank/Select Data Structures for Large-Alphabet Strings

Large-alphabet strings are common in scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in…

Data Structures and Algorithms · Computer Science 2024-05-03 Diego Arroyuelo, Gabriel Carmona, Héctor Larrañaga, Francisco Riveros, Carlos Eugenio Rojas-Morales, Erick Sepúlveda

Proving tree algorithms for succinct data structures

Succinct data structures give space-efficient representations of large amounts of data without sacrificing performance. They rely on cleverly designed data representations and algorithms. We present here the formalization in Coq/SSReflect…

Programming Languages · Computer Science 2019-07-03 Reynald Affeldt, Jacques Garrigue, Xuanrui Qi, Kazunari Tanaka

Theory Meets Practice for Bit Vectors Supporting Rank and Select

Bit vectors with support for fast rank and select are a fundamental building block for compressed data structures. We close a gap between theory and practice by analyzing an important part of the design space and experimentally evaluating a…

Data Structures and Algorithms · Computer Science 2025-09-23 Florian Kurpicz, Niccolò Rigi-Luperti, Peter Sanders

An Optimal Choice Dictionary

A choice dictionary is a data structure that can be initialized with a parameter $n\in\{1,2,\ldots\}$ and subsequently maintains an initially empty subset $S$ of $\{1,\ldots,n\}$ under insertion, deletion, membership queries and an…

Data Structures and Algorithms · Computer Science 2017-11-03 Torben Hagerup

Range (Rényi) Entropy Queries and Partitioning

Data partitioning that maximizes/minimizes the Shannon entropy, or more generally the Rényi entropy, is a crucial subroutine in data compression, columnar storage, and cardinality estimation algorithms. These partition algorithms can be…

Data Structures and Algorithms · Computer Science 2025-11-05 Aryan Esmailpour, Sanjay Krishnan, Stavros Sintos

Text Ranking and Classification using Data Compression

A well-known but rarely used approach to text categorization uses conditional entropy estimates computed using data compression tools. Text affinity scores derived from compressed sizes can be used for classification and ranking tasks, but…

Machine Learning · Computer Science 2021-12-08 Nitya Kasturi, Igor L. Markov

SYNTAX: A computer program to compress a sequence and to estimate its information content

The determination of block-entropies is a well established method for the investigation of discrete data, also called symbols (7). There is a large variety of such symbolic sequences, ranging from texts written in natural languages,…

Disordered Systems and Neural Networks · Physics 2007-05-23 Miguel Angel Jimenez-Montano, Werner Ebeling, Thorsten Poeschel

Optimal Lower and Upper Bounds for Representing Sequences

Sequence representations supporting queries $access$, $select$ and $rank$ are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how…

Data Structures and Algorithms · Computer Science 2013-08-26 Djamal Belazzougui, Gonzalo Navarro

Entropy Bounds for Grammar-Based Tree Compressors

The definition of $k^{th}$-order empirical entropy of strings is extended to node labelled binary trees. A suitable binary encoding of tree straight-line programs (that have been used for grammar-based tree compression before) is shown to…

Data Structures and Algorithms · Computer Science 2020-05-21 Danny Hucke, Markus Lohrey, Louisa Seelbach Benkner

Queries on LZ-Bounded Encodings

We describe a data structure that stores a string $S$ in space similar to that of its Lempel-Ziv encoding and efficiently supports access, rank and select queries. These queries are fundamental for implementing succinct and compressed data…

Data Structures and Algorithms · Computer Science 2014-12-03 Djamal Belazzougui, Travis Gagie, Paweł Gawrychowski, Juha Kärkkäinen, Alberto Ordóñez, Simon J. Puglisi, Yasuo Tabei

Entropy bounds for grammar compression

Grammar compression represents a string as a context-free grammar. Achieving compression requires encoding such grammar as a binary string; there are a few commonly used encodings. We bound the size of practically used encodings for several…

Data Structures and Algorithms · Computer Science 2020-05-21 Michał Gańczorz

Fast Prefix Search in Little Space, with Applications

It has been shown in the indexing literature that there is an essential difference between prefix/range searches on the one hand, and predecessor/rank searches on the other hand, in that the former provably allows faster query resolution.…

Data Structures and Algorithms · Computer Science 2018-04-16 Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, Sebastiano Vigna

A Searchable Compressed Edit-Sensitive Parsing

Practical data structures for the edit-sensitive parsing (ESP) are proposed. Given a string S, its ESP tree is equivalent to a context-free grammar G generating just S, which is represented by a DAG. Using the succinct data structures for…

Data Structures and Algorithms · Computer Science 2015-03-17 Naoya Kishiue, Masaya Nakahara, Shirou Maruyama, Hiroshi Sakamoto

Structural Entropy Guided Probabilistic Coding

Probabilistic embeddings have several advantages over deterministic embeddings as they map each data point to a distribution, which better describes the uncertainty and complexity of data. Many works focus on adjusting the distribution…

Artificial Intelligence · Computer Science 2024-12-16 Xiang Huang, Hao Peng, Li Sun, Hui Lin, Chunyang Liu, Jiang Cao, Philip S. Yu

Structural-Entropy-Based Sample Selection for Efficient and Effective Learning

Sample selection improves the efficiency and effectiveness of machine learning models by providing informative and representative samples. Typically, samples can be modeled as a sample graph, where nodes are samples and edges represent…

Machine Learning · Computer Science 2025-03-04 Tianchi Xie, Jiangning Zhu, Guozu Ma, Minzhi Lin, Wei Chen, Weikai Yang, Shixia Liu