Related papers: Data Structure Lower Bounds for Document Indexing …

Cross-Document Pattern Matching

We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern…

Data Structures and Algorithms · Computer Science 2012-06-21 Gregory Kucherov, Yakov Nekrich, Tatiana Starikovskaya

Lower Bounds for the Algorithmic Complexity of Learned Indexes

Learned index structures aim to accelerate queries by training machine learning models to approximate the rank function associated with a database attribute. While effective in practice, their theoretical limitations are not fully…

Data Structures and Algorithms · Computer Science 2026-01-13 Luis Alberto Croquevielle, Roman Sokolovskii, Thomas Heinis

Fast Set Intersection and Two Patterns Matching

In this paper we present a new problem, the fast set intersection problem, which is to preprocess a collection of sets in order to efficiently report the intersection of any two sets in the collection. In addition we suggest new solutions…

Data Structures and Algorithms · Computer Science 2010-03-12 Hagai Cohen, Ely Porat

Dynamic Data Structures for Document Collections and Graphs

In the dynamic indexing problem, we must maintain a changing collection of text documents so that we can efficiently support insertions, deletions, and pattern matching queries. We are especially interested in developing efficient data…

Data Structures and Algorithms · Computer Science 2015-03-23 J. Ian Munro, Yakov Nekrich, Jeffrey Scott Vitter

Upper and lower bounds for dynamic data structures on strings

We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give…

Data Structures and Algorithms · Computer Science 2018-02-20 Raphael Clifford, Allan Grønlund, Kasper Green Larsen, Tatiana Starikovskaya

Optimal Lower and Upper Bounds for Representing Sequences

Sequence representations supporting queries $access$, $select$ and $rank$ are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how…

Data Structures and Algorithms · Computer Science 2013-08-26 Djamal Belazzougui, Gonzalo Navarro

Symmetry, Outer Bounds, and Code Constructions: A Computer-Aided Investigation on the Fundamental Limits of Caching

We illustrate how computer-aided methods can be used to investigate the fundamental limits of the caching systems, which are significantly different from the conventional analytical approach usually seen in the information theory…

Information Theory · Computer Science 2018-08-28 Chao Tian

Practical Top-K Document Retrieval in Reduced Space

Supporting top-k document retrieval queries on general text databases, that is, finding the k documents where a given pattern occurs most frequently, has become a topic of interest with practical applications. While the problem has been…

Data Structures and Algorithms · Computer Science 2011-11-21 Gonzalo Navarro, Daniel Valenzuela

Lower bounds for text indexing with mismatches and differences

In this paper we study lower bounds for the fundamental problem of text indexing with mismatches and differences. In this problem we are given a long string of length $n$, the "text", and the task is to preprocess it into a data structure…

Data Structures and Algorithms · Computer Science 2018-12-24 Vincent Cohen-Addad, Laurent Feuilloley, Tatiana Starikovskaya

Engineering Small Space Dictionary Matching

The dictionary matching problem is to locate occurrences of any pattern among a set of patterns in a given text. Massive data sets abound and at the same time, there are many settings in which working space is extremely limited. We…

Data Structures and Algorithms · Computer Science 2013-01-29 Shoshana Marcus, Dina Sokol

Efficient Hypergraph Pattern Matching via Match-and-Filter and Intersection Constraint

A hypergraph is a generalization of a graph, in which a hyperedge can connect multiple vertices, modeling complex relationships involving multiple vertices simultaneously. Hypergraph pattern matching, which is to find all isomorphic…

Databases · Computer Science 2025-12-23 Siwoo Song, Wonseok Shin, Kunsoo Park, Giuseppe F. Italiano, Zhengyi Yang, Wenjie Zhang

Structured Index Coding Problem and Multi-access Coded Caching

Index coding and coded caching are two active research topics in information theory with strong ties to each other. Motivated by the multi-access coded caching problem, we study a new class of structured index coding problems (ICPs) which…

Information Theory · Computer Science 2021-11-17 Kota Srinivas Reddy, Nikhil Karamchandani

Lower Bounds for Semi-adaptive Data Structures via Corruption

In a dynamic data structure problem we wish to maintain an encoding of some data in memory, in such a way that we may efficiently carry out a sequence of queries and updates to the data. A long-standing open problem in this area is to prove…

Computational Complexity · Computer Science 2020-10-05 Pavel Dvořák, Bruno Loff

Bloom maps

We consider the problem of succinctly encoding a static map to support approximate queries. We derive upper and lower bounds on the space requirements in terms of the error rate and the entropy of the distribution of values over keys: our…

Data Structures and Algorithms · Computer Science 2007-10-18 David Talbot, John Talbot

Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure

The prototypical high-dimensional statistics problem entails finding a structured signal in noise. Many of these problems exhibit an intriguing phenomenon: the amount of data needed by all known computationally efficient algorithms far…

Computational Complexity · Computer Science 2019-11-19 Matthew Brennan, Guy Bresler, Wasim Huleihel

Document Counting in Practice

We address the problem of counting the number of strings in a collection where a given pattern appears, which has applications in information retrieval and data mining. Existing solutions are in a theoretical stage. We implement these…

Data Structures and Algorithms · Computer Science 2015-10-02 Travis Gagie, Aleksi Hartikainen, Juha Kärkkäinen, Gonzalo Navarro, Simon J. Puglisi, Jouni Sirén

Mapping and Classifying Molecules from a High-Throughput Structural Database

High-throughput computational materials design promises to greatly accelerate the process of discovering new materials and compounds, and of optimizing their properties. The large databases of structures and properties that result from…

Chemical Physics · Physics 2016-11-22 Sandip De, Felix Musil, Teresa Ingram, Carsten Baldauf, Michele Ceriotti

A Compact Index for Order-Preserving Pattern Matching

Order-preserving pattern matching was introduced recently but it has already attracted much attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same…

Data Structures and Algorithms · Computer Science 2018-12-11 Gianni Decaroli, Travis Gagie, Giovanni Manzini

Query Lower Bounds for Correlation Clustering under Memory Constraints

This work initiates the study of memory-query tradeoffs for graph problems, with a focus on correlation clustering. Correlation clustering asks for a partition of the vertices that minimizes disagreements: non-edges inside clusters plus…

Computational Complexity · Computer Science 2026-05-25 Sumegha Garg, Songhua He, Periklis A. Papakonstantinou

Improved distance queries in planar graphs

There are several known data structures that answer distance queries between two arbitrary vertices in a planar graph. The tradeoff is among preprocessing time, storage space and query time. In this paper we present three data structures…

Data Structures and Algorithms · Computer Science 2011-02-23 Yahav Nussbaum