Related papers: An efficient algorithm for three-component key ind…

200 papers

A search query consists of several words. In a proximity full-text search, we want to find documents that contain these words near each other. This task is time-consuming when the query consists of frequently occurring words. If we…

Information Retrieval · Computer Science 2020-09-08 Alexander B. Veretennikov

Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains query terms near each other, especially if the query terms are frequently occurring words. For…

Information Retrieval · Computer Science 2019-07-11 Alexander B. Veretennikov

Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains query terms near each other, especially if the query terms are frequently occurring words. For…

Information Retrieval · Computer Science 2020-09-09 Alexander B. Veretennikov

The problem of proximity full-text search is considered. If a search query contains frequently occurring words, multi-component key indexes improve search speed compared with ordinary inverted indexes. It was…

Information Retrieval · Computer Science 2021-08-03 Alexander B. Veretennikov

Proximity full-text search is commonly implemented in contemporary full-text search systems. Let us assume that the search query is a list of words. It is natural to consider a document as relevant if the queried words are near each other…

Information Retrieval · Computer Science 2021-01-12 Alexander B. Veretennikov

Full-text search engines are important tools for information retrieval. Term proximity is an important factor in relevance score measurement. In a proximity full-text search, we assume that a relevant document contains query terms near each…

Information Retrieval · Computer Science 2018-11-20 Alexander B. Veretennikov

We consider strategies to organize easily updatable associative arrays in external memory. These arrays are used for full-text search. We study indexes with different keys: single word form, two word forms, and sequences of word forms. The…

Information Retrieval · Computer Science 2020-07-21 Alexander B. Veretennikov

In a dynamic retrieval system, documents must be ingested as they arrive, and be immediately findable by queries. Our purpose in this paper is to describe an index structure and processing regime that accommodates that requirement for…

Information Retrieval · Computer Science 2023-01-12 Alistair Moffat, Joel Mackenzie

Text clustering holds significant value across various domains due to its ability to identify patterns and group related information. Current approaches which rely heavily on a computed similarity measure between documents are often limited…

Information Retrieval · Computer Science 2025-04-09 Laurence Hirsch, Robin Hirsch, Bayode Ogunleye

Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.

Information Retrieval · Computer Science 2018-11-27 A. B. Veretennikov

We show how full-text search based on inverted indices can be accelerated by clustering the documents without losing results (SeCluD: SEarch with CLUstered Documents). We develop a fast multilevel clustering algorithm that explicitly uses…

Information Retrieval · Computer Science 2014-11-06 Jonathan Dimond, Peter Sanders

The data structure at the core of large-scale search engines is the inverted index, which is essentially a collection of sorted integer sequences called inverted lists. Because of the many documents indexed by such engines and stringent…

Information Retrieval · Computer Science 2022-02-08 Giulio Ermanno Pibiri, Rossano Venturini

Finding desired information in a large data set is a difficult problem. Information retrieval is concerned with the structure, analysis, organization, storage, searching, and retrieval of information. The index is the main constituent of an IR…

Information Retrieval · Computer Science 2012-09-26 Md. Abdullah al Mamun, Md. Hanif, Md. Rakib Uddin, Tanvir Ahmed, Md. Mofizul Islam

Inverted indexes are vital in providing fast keyword-based search. For every term in the document collection, a list of identifiers of documents in which the term appears is stored, along with auxiliary information such as term frequency,…

Information Retrieval · Computer Science 2019-01-30 Harrie Oosterhuis, J. Shane Culpepper, Maarten de Rijke

Inverted indexes continue to be a mainstay of text search engines, allowing efficient querying of large document collections. While there are a number of possible organizations, document-ordered indexes are the most common, since they are…

Information Retrieval · Computer Science 2021-06-14 Joel Mackenzie, Matthias Petri, Alistair Moffat

Inverted file structure is a common technique for accelerating dense retrieval. It clusters documents based on their embeddings; during searching, it probes nearby clusters w.r.t. an input query and only evaluates documents within them by…

Information Retrieval · Computer Science 2023-10-18 Peitian Zhang, Zheng Liu, Shitao Xiao, Zhicheng Dou, Jing Yao

Search engines rely heavily on term-based approaches that represent queries and documents as bags of words. Text (a document or a query) is represented by a bag of its words that ignores grammar and word order, but retains word frequency…

Information Retrieval · Computer Science 2017-11-17 Christophe Van Gysel

Querying over XML elements using keyword search is steadily gaining popularity. The traditional similarity measure is widely employed in order to effectively retrieve various XML documents. A number of authors have already proposed…

Information Retrieval · Computer Science 2010-12-20 Yang Wang, Zhikui Chen, Xiaodi Huang

A major difficulty in applying word vector embeddings in IR is in devising an effective and efficient strategy for obtaining representations of compound units of text, such as whole documents (in comparison to the atomic words), for the…

Information Retrieval · Computer Science 2016-06-28 Dwaipayan Roy, Debasis Ganguly, Mandar Mitra, Gareth J. F. Jones

Fast, high-quality document clustering is an important task in organizing information and search engine results obtained from user queries, and in enhancing web crawling and information retrieval. With the large amount of data available and with a…

Information Retrieval · Computer Science 2010-03-11 Alok Ranjan, Harish Verma, Eatesh Kandpal, Joydip Dhar