Related papers: Efficient Index for Weighted Sequences
In a \emph{weighted sequence}, for every position of the sequence and every letter of the alphabet a probability of occurrence of this letter at this position is specified. Weighted sequences are commonly used to represent imprecise or…
A weighted string over an alphabet of size $\sigma$ is a string in which a set of letters may occur at each position with respective occurrence probabilities. Weighted strings, also known as position weight matrices or uncertain sequences,…
Strings in the real world are often encoded with some level of uncertainty. In the character-level uncertainty model, an uncertain string $X$ of length $n$ on an alphabet $\Sigma$ is a sequence of $n$ probability distributions over…
The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…
A weighted string, also known as a position weight matrix, is a sequence of probability distributions over some alphabet. We revisit the Weighted Shortest Common Supersequence (WSCS) problem, introduced by Amir et al. [SPIRE 2011], that is,…
In this paper we describe a data structure that supports pattern matching queries on a dynamically arriving text over an alphabet of constant size. Each new symbol can be prepended to $T$ in $O(1)$ worst-case time. At any moment, we can report…
Given a pattern string $P$ of length $n$ and a query string $T$ of length $m$, where the characters of $P$ and $T$ are drawn from an alphabet of size $\Delta$, the {\em exact string matching} problem consists of finding all occurrences of…
Given a pattern $P$ and a text $T$, both strings over a binary alphabet, the binary jumbled string matching problem consists in telling whether any permutation of $P$ occurs in $T$. The indexed version of this problem, i.e., preprocessing a…
String matching is the problem of finding all the substrings of a text which match a given pattern. It is one of the most investigated problems in computer science, mainly due to its very diverse applications in several fields. Recently,…
Computing the {\em matching statistics} of a string $P[1..m]$ with respect to a text $T[1..n]$ is a fundamental problem which has application to genome sequence comparison. In this paper, we study the problem of computing the matching…
Given a string $S$ over an alphabet $\Sigma$, the 'string indexing problem' is to preprocess $S$ to subsequently support efficient pattern matching queries, i.e., given a pattern string $P$ report all the occurrences of $P$ in $S$. In this…
In this paper, we describe a new type of match between a pattern and a text that is not necessarily maximal in the query but still contains useful matching information: locally maximal exact matches (LEMs). There are usually a large amount…
Online string matching is a computational problem involving the search for patterns or substrings in a large text dataset, with the pattern and text being processed sequentially, without prior access to the entire text. Its relevance stems…
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…
This paper addresses the online exact string matching problem which consists in finding all occurrences of a given pattern p in a text t. It is an extensively studied problem in computer science, mainly due to its direct applications to…
The research on indexing repetitive string collections has focused on the same search problems used for regular string collections, though they can make little sense in this scenario. For example, the basic pattern matching query "list all…
A pattern p (i.e., a string of variables and terminals) matches a word w, if w can be obtained by uniformly replacing the variables of p by terminal words. The respective matching problem, i.e., deciding whether or not a given pattern…
The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in S), reporting…
In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to…
Pattern matching is the most central task for text indices. Most recent indices leverage compression techniques to make pattern matching feasible for massive but highly compressible datasets. Among indices of this kind, we propose a new…