Related papers: Approximate String Matching using a Bidirectional …

Optimum Search Schemes for Approximate String Matching Using Bidirectional FM-Index

Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. Bidirectional indices have opened new possibilities in this regard allowing the…

Data Structures and Algorithms · Computer Science 2018-03-06 Kiavash Kianfar, Christopher Pockrandt, Bahman Torkamandi, Haochen Luo, Knut Reinert

Approximate String Matching: Theory and Applications (La Recherche Approchée de Motifs : Théorie et Applications)

Approximate string matching is a fundamental and recurrent problem that arises in most computer science fields. This problem can be defined as follows: Let $D=\{x_1,x_2,\ldots,x_d\}$ be a set of $d$ words defined on an alphabet…

Data Structures and Algorithms · Computer Science 2017-01-31 Ibrahim Chegrane

Pattern Matching in Trees and Strings

We study the design of efficient algorithms for combinatorial pattern matching. More concretely, we study algorithms for tree matching, string matching, and string matching in compressed texts.

Data Structures and Algorithms · Computer Science 2007-09-03 Philip Bille

Good parts first - a new algorithm for approximate search in lexica and string databases

We present a new efficient method for approximate search in electronic lexica. Given an input string (the pattern) and a similarity threshold, the algorithm retrieves all entries of the lexicon that are sufficiently similar to the pattern.…

Computation and Language · Computer Science 2015-12-04 Stefan Gerdjikov, Stoyan Mihov, Petar Mitankin, Klaus U. Schulz

Investigative Simulation: Towards Utilizing Graph Pattern Matching for Investigative Search

This paper proposes the use of graph pattern matching for investigative graph search, which is the process of searching for and prioritizing persons of interest who may exhibit part or all of a pattern of suspicious behaviors or…

Social and Information Networks · Computer Science 2016-08-08 Benjamin W. K. Hung, Anura P. Jayasumana

Approximate textual retrieval

An approximate textual retrieval algorithm for searching sources with high levels of defects is presented. It considers splitting the words in a query into two overlapping segments and subsequently building composite regular expressions…

Information Retrieval · Computer Science 2007-05-23 Pere Constans

Learning as Search Optimization: Approximate Large Margin Methods for Structured Prediction

Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., linear chains) in which search and parameter estimation can be…

Machine Learning · Computer Science 2009-07-07 Hal Daumé, Daniel Marcu

Efficient Approximate Search for Sets of Vectors

We consider a similarity measure between two sets $A$ and $B$ of vectors, that balances the average and maximum cosine distance between pairs of vectors, one from set $A$ and one from set $B$. As a motivation for this measure, we present…

Data Structures and Algorithms · Computer Science 2021-08-31 Michael Leybovich, Oded Shmueli

Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes

Full-text search engines are important tools for information retrieval. Term proximity is an important factor in relevance score measurement. In a proximity full-text search, we assume that a relevant document contains query terms near each…

Information Retrieval · Computer Science 2018-11-20 Alexander B. Veretennikov

Average-Case Optimal Approximate Circular String Matching

Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that…

Data Structures and Algorithms · Computer Science 2016-04-26 Carl Barton, Costas S. Iliopoulos, Solon P. Pissis

Proximity full-text searches of frequently occurring words with a response time guarantee

Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains query terms near each other, especially if the query terms are frequently occurring words. For…

Information Retrieval · Computer Science 2020-09-09 Alexander B. Veretennikov

Optimal-Hash Exact String Matching Algorithms

String matching is the problem of finding all the occurrences of a pattern in a text. We propose improved versions of the fast family of string matching algorithms based on hashing $q$-grams. The improvement consists of considering minimal…

Data Structures and Algorithms · Computer Science 2023-03-13 Thierry Lecroq

Proximity Full-Text Search by Means of Additional Indexes with Multi-component Keys: In Pursuit of Optimal Performance

Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains query terms near each other, especially if the query terms are frequently occurring words. For…

Information Retrieval · Computer Science 2019-07-11 Alexander B. Veretennikov

Finding approximate palindromes in strings

We introduce a novel definition of approximate palindromes in strings, and provide an algorithm to find all maximal approximate palindromes in a string with up to $k$ errors. Our definition is based on the usual edit operations of…

Data Structures and Algorithms · Computer Science 2007-05-23 A. H. L. Porto, V. C. Barbosa

About a structure of easily updatable full-text indexes

We consider strategies to organize easily updatable associative arrays in external memory. These arrays are used for full-text search. We study indexes with different keys: single word form, two word forms, and sequences of word forms. The…

Information Retrieval · Computer Science 2020-07-21 Alexander B. Veretennikov

Approximate Cluster-Based Sparse Document Retrieval with Segmented Maximum Term Weights

This paper revisits cluster-based retrieval that partitions the inverted index into multiple groups and skips the index partially at cluster and document levels during online inference using a learned sparse representation. It proposes an…

Information Retrieval · Computer Science 2024-04-16 Yifan Qiao, Shanxiu He, Yingrui Yang, Parker Carlson, Tao Yang

A practical index for approximate dictionary matching with few mismatches

Approximate dictionary matching is a classic string matching problem (checking if a query string occurs in a collection of strings) with applications in, e.g., spellchecking, online catalogs, geolocation, and web searchers. We present a…

Data Structures and Algorithms · Computer Science 2016-02-15 Aleksander Cisłak, Szymon Grabowski

Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions

This paper proposes a new framework for providing approximation guarantees of local search algorithms. Local search is a basic algorithm design technique and is widely used for various combinatorial optimization problems. To analyze local…

Data Structures and Algorithms · Computer Science 2020-06-03 Kaito Fujii

Approximate Counting in Local Lemma Regimes

We establish efficient approximate counting algorithms for several natural problems in local lemma regimes. In particular, we consider the probability of intersection of events and the dimension of intersection of subspaces. Our approach is…

Data Structures and Algorithms · Computer Science 2025-12-12 Ryan L. Mann, Gabriel Waite

Anti-sparse coding for approximate nearest neighbor search

This paper proposes a binarization scheme for vectors of high dimension based on the recent concept of anti-sparse coding, and shows its excellent performance for approximate nearest neighbor search. Unlike other binarization schemes, this…

Computer Vision and Pattern Recognition · Computer Science 2011-10-27 Hervé Jégou, Teddy Furon, Jean-Jacques Fuchs