Related papers: Indexes in Microsoft SQL Server

Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words

Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.

Information Retrieval · Computer Science 2018-11-27 A. B. Veretennikov

Techniques for Inverted Index Compression

The data structure at the core of large-scale search engines is the inverted index, which is essentially a collection of sorted integer sequences called inverted lists. Because of the many documents indexed by such engines and stringent…

Information Retrieval · Computer Science 2022-02-08 Giulio Ermanno Pibiri, Rossano Venturini

Cortex: Harnessing Correlations to Boost Query Performance

Databases employ indexes to filter out irrelevant records, which reduces scan overhead and speeds up query execution. However, this optimization is only available to queries that filter on the indexed attribute. To extend these speedups to…

Databases · Computer Science 2020-12-15 Vikram Nathan, Jialin Ding, Tim Kraska, Mohammad Alizadeh

Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval

Inverted file structure is a common technique for accelerating dense retrieval. It clusters documents based on their embeddings; during searching, it probes nearby clusters w.r.t. an input query and only evaluates documents within them by…

Information Retrieval · Computer Science 2023-10-18 Peitian Zhang, Zheng Liu, Shitao Xiao, Zhicheng Dou, Jing Yao

Compressed bitmap indexes: beyond unions and intersections

Compressed bitmap indexes are used to speed up simple aggregate queries in databases. Indeed, set operations like intersections, unions and complements can be represented as logical operations (AND, OR, NOT) that are ideally suited for…

Databases · Computer Science 2016-01-11 Owen Kaser, Daniel Lemire

The Case for Learned Index Structures

Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model…

Databases · Computer Science 2018-05-01 Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis

Hippo: A Fast, yet Scalable, Database Indexing Approach

Even though existing database indexes (e.g., B+-Tree) speed up the query execution, they suffer from two main drawbacks: (1) A database index usually yields 5% to 15% additional storage overhead which results in non-ignorable dollar cost in…

Databases · Computer Science 2016-04-13 Jia Yu, Mohamed Sarwat

A Learned Index for Exact Similarity Search in Metric Spaces

Indexing is an effective way to support efficient query processing in large databases. Recently the concept of learned index, which replaces or complements traditional index structures with machine learning models, has been actively…

Databases · Computer Science 2022-08-01 Yao Tian, Tingyun Yan, Xi Zhao, Kai Huang, Xiaofang Zhou

Reordering Columns for Smaller Indexes

Column-oriented indexes, such as projection or bitmap indexes, are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right…

Databases · Computer Science 2015-03-13 Daniel Lemire, Owen Kaser

Tree Index: A New Cluster Evaluation Technique

We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation…

Machine Learning · Computer Science 2020-03-25 A. H. Beg, Md Zahidul Islam, Vladimir Estivill-Castro

Faster Exact Search using Document Clustering

We show how full-text search based on inverted indices can be accelerated by clustering the documents without losing results (SeCluD -- SEarch with CLUstered Documents). We develop a fast multilevel clustering algorithm that explicitly uses…

Information Retrieval · Computer Science 2014-11-06 Jonathan Dimond, Peter Sanders

The Potential of Learned Index Structures for Index Compression

Inverted indexes are vital in providing fast keyword-based search. For every term in the document collection, a list of identifiers of documents in which the term appears is stored, along with auxiliary information such as term frequency,…

Information Retrieval · Computer Science 2019-01-30 Harrie Oosterhuis, J. Shane Culpepper, Maarten de Rijke

Anytime Ranking on Document-Ordered Indexes

Inverted indexes continue to be a mainstay of text search engines, allowing efficient querying of large document collections. While there are a number of possible organizations, document-ordered indexes are the most common, since they are…

Information Retrieval · Computer Science 2021-06-14 Joel Mackenzie, Matthias Petri, Alistair Moffat

Document clustering with evolved multiword search queries

Text clustering holds significant value across various domains due to its ability to identify patterns and group related information. Current approaches, which rely heavily on a computed similarity measure between documents, are often limited…

Information Retrieval · Computer Science 2025-04-09 Laurence Hirsch, Robin Hirsch, Bayode Ogunleye

An Improved Algorithm for Fast K-Word Proximity Search Based on Multi-Component Key Indexes

A search query consists of several words. In a proximity full-text search, we want to find documents that contain these words near each other. This task requires much time when the query consists of frequently occurring words. If we…

Information Retrieval · Computer Science 2020-09-08 Alexander B. Veretennikov

Proposing Cluster_Similarity Method in Order to Find as Much Better Similarities in Databases

Different ways of entering data into databases result in duplicate records that increase the size of the databases. This is a fact that cannot easily be ignored. There are several methods that are used for this purpose. In this paper, we…

Databases · Computer Science 2011-12-15 Mohammad-Reza Feizi-Derakhshi , Azade Roohany

Suffix arrays with a twist

The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches aiming to improve its efficiency even more: changes to the navigation, data layout and adding extra data. In short, we show…

Data Structures and Algorithms · Computer Science 2016-07-28 Tomasz Kowalski, Szymon Grabowski, Kimmo Fredriksson, Marcin Raniszewski

A Novel Approach for Web Page Set Mining

One of the most time-consuming steps in association rule mining is computing the frequency of occurrence of itemsets in the database. The hash table index approach converts a transaction database to a hash index tree by…

Databases · Computer Science 2011-11-14 R. B. Geeta, Omkar Mamillapalli, Shasikumar G. Totad, Prasad Reddy P. V. G. D

MINT: Multi-Vector Search Index Tuning

Vector search plays a crucial role in many real-world applications. In addition to single-vector search, multi-vector search becomes important for multi-modal and multi-feature scenarios today. In a multi-vector database, each row is an…

Databases · Computer Science 2026-05-05 Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri

Fast, Incremental Inverted Indexing in Main Memory for Web-Scale Collections

For text retrieval systems, the assumption that all data structures reside in main memory is increasingly common. In this context, we present a novel incremental inverted indexing algorithm for web-scale collections that directly constructs…

Information Retrieval · Computer Science 2013-05-06 Nima Asadi, Jimmy Lin