Related papers: Sequential-Access FM-Indexes

Acceleration of FM-index Queries Through Prefix-free Parsing

FM-indexes are a crucial data structure in DNA alignment, for example, but searching with them usually takes at least one random access per character in the query pattern. Ferragina and Fischer observed in 2007 that word-based indexes often…

Data Structures and Algorithms · Computer Science 2023-05-11 Aaron Hong, Marco Oliva, Dominik Köppl, Hideo Bannai, Christina Boucher, Travis Gagie

Fast construction of FM-index for long sequence reads

Summary: We present a new method to incrementally construct the FM-index for both short and long sequence reads, up to the size of a genome. It is the first algorithm that can build the index while implicitly sorting the sequences in the…

Genomics · Quantitative Biology 2014-08-29 Heng Li

A Compact Index for Order-Preserving Pattern Matching

Order-preserving pattern matching was introduced recently but it has already attracted much attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same…

Data Structures and Algorithms · Computer Science 2018-12-11 Gianni Decaroli, Travis Gagie, Giovanni Manzini

Large-Scale Pattern Search Using Reduced-Space On-Disk Suffix Arrays

The suffix array is an efficient data structure for in-memory pattern search. Suffix arrays can also be used for external-memory pattern search, via two-level structures that use an internal index to identify the correct block of suffix…

Data Structures and Algorithms · Computer Science 2013-03-27 Simon Gog, Alistair Moffat, J. Shane Culpepper, Andrew Turpin, Anthony Wirth

Efficient Retrieval of Similar Time Sequences Using DFT

We propose an improvement of the known DFT-based indexing technique for fast retrieval of similar time sequences. We use the last few Fourier coefficients in the distance computation without storing them in the index since every coefficient…

Databases · Computer Science 2007-05-23 Davood Rafiei, Alberto Mendelzon

About a structure of easily updatable full-text indexes

We consider strategies to organize easily updatable associative arrays in external memory. These arrays are used for full-text search. We study indexes with different keys: single word form, two word forms, and sequences of word forms. The…

Information Retrieval · Computer Science 2020-07-21 Alexander B. Veretennikov

FM-index for dummies

The FM-index is a celebrated compressed data structure for full-text pattern searching. After the first wave of interest in its theoretical developments, we can observe a surge of interest in practical FM-index variants in the last few…

Data Structures and Algorithms · Computer Science 2015-10-27 Szymon Grabowski, Marcin Raniszewski, Sebastian Deorowicz

Order-preserving factor analysis (OPFA)

We present a novel factor analysis method that can be applied to the discovery of common factors shared among trajectories in multivariate time series data. These factors satisfy a precedence-ordering property: certain factors are recruited…

Machine Learning · Statistics 2011-05-10 Arnau Tibau Puig, Alfred O. Hero

A Time Efficient Indexing Scheme for Complex Spatiotemporal Retrieval

The paper is concerned with the time efficient processing of spatiotemporal predicates, i.e. spatial predicates associated with an exact temporal constraint. A set of such predicates forms a buffer query or a Spatio-temporal Pattern (STP)…

Databases · Computer Science 2008-12-18 Lagogiannis George, Lorentzos Nikos, Sioutas Spyros, Theodoridis Evaggelos

Indexes in Microsoft SQL Server

Indexes are the best apposite choice for quickly retrieving the records. This is nothing but cutting down the number of Disk IO. Instead of scanning the complete table for the results, we can decrease the number of IO's or page fetches…

Databases · Computer Science 2019-03-21 Sourav Mukherjee

Cortex: Harnessing Correlations to Boost Query Performance

Databases employ indexes to filter out irrelevant records, which reduces scan overhead and speeds up query execution. However, this optimization is only available to queries that filter on the indexed attribute. To extend these speedups to…

Databases · Computer Science 2020-12-15 Vikram Nathan, Jialin Ding, Tim Kraska, Mohammad Alizadeh

Indexing Finite-State Automata Using Forward-Stable Partitions

An index on a finite-state automaton is a data structure able to locate specific patterns on the automaton's paths and consequently on the regular language accepted by the automaton itself. Cotumaccio and Prezza [SODA '21], introduced a…

Formal Languages and Automata Theory · Computer Science 2024-06-06 Ruben Becker, Sung-Hwan Kim, Nicola Prezza, Carlo Tosoni

Fast Query Processing by Distributing an Index over CPU Caches

Data intensive applications on clusters often require requests quickly be sent to the node managing the desired data. In many applications, one must look through a sorted tree structure to determine the responsible node for accessing or…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Xiaoqin Ma, Gene Cooperman

Database Theory in Action: Direct Access to Query Answers

Direct access asks for the retrieval of query answers by their ranked position, given a query and a desired order. While the time complexity of data structures supporting such accesses has been studied in depth, and efficient algorithms for…

Databases · Computer Science 2026-03-23 Jiayin Hu, Nikolaos Tziavelis

A bloated FM-index reducing the number of cache misses during the search

The FM-index is a well-known compressed full-text index, based on the Burrows-Wheeler transform (BWT). During a pattern search, the BWT sequence is accessed at "random" locations, which is cache-unfriendly. In this paper, we are interested…

Data Structures and Algorithms · Computer Science 2015-12-08 Szymon Grabowski, Aleksander Cisłak

Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words

Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.

Information Retrieval · Computer Science 2018-11-27 A. B. Veretennikov

Search on Secondary Attributes in Geo-Distributed Systems

In the age of big data, more and more applications need to query and analyse large volumes of continuously updated data in real-time. In response, cloud-scale storage systems can extend their interface that allows fast lookups on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-10 Dimitrios Vasilas

Fast Prefix Search in Little Space, with Applications

It has been shown in the indexing literature that there is an essential difference between prefix/range searches on the one hand, and predecessor/rank searches on the other hand, in that the former provably allows faster query resolution.…

Data Structures and Algorithms · Computer Science 2018-04-16 Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, Sebastiano Vigna

Aggregation and Ordering in Factorised Databases

A common approach to data analysis involves understanding and manipulating succinct representations of data. In earlier work, we put forward a succinct representation system for relational data called factorised databases and reported on…

Databases · Computer Science 2013-07-02 Nurzhan Bakibayev, Tomáš Kočiský, Dan Olteanu, Jakub Závodný

Iterative Algorithm for Finding Frequent Patterns in Transactional Databases

A high-performance algorithm for searching for frequent patterns (FPs) in transactional databases is presented. The search for FPs is carried out by using an iterative sieve algorithm by computing the set of enclosed cycles. In each inner…

Databases · Computer Science 2007-05-23 Gennady P. Berman, Vyacheslav N. Gorshkov, Edward P. MacKerrow, Xidi Wang