Related papers: Sparse Regular Expression Matching
Regular expressions constitute a fundamental notion in formal language theory and are frequently used in computer science to define search patterns. A classic algorithm for these problems constructs and simulates a non-deterministic finite…
An extended regular expression $R$ specifies a set of strings formed by characters from an alphabet combined with concatenation, union, intersection, complement, and star operators. Given an extended regular expression $R$ and a string $Q$,…
Given a regular expression $R$ and a string $Q$, the regular expression parsing problem is to determine if $Q$ matches $R$ and if so, determine how it matches, e.g., by a mapping of the characters of $Q$ to the characters in $R$. Regular…
In this paper we revisit the classical regular expression matching problem, namely, given a regular expression $R$ and a string $Q$, decide if $Q$ matches one of the strings specified by $R$. Let $m$ and $n$ be the length of $R$ and $Q$,…
Given a pattern string $P$ of length $n$ and a query string $T$ of length $m$, where the characters of $P$ and $T$ are drawn from an alphabet of size $\Delta$, the {\em exact string matching} problem consists of finding all occurrences of…
We study regular expression membership testing: Given a regular expression of size $m$ and a string of size $n$, decide whether the string is in the language described by the regular expression. Its classic $O(nm)$ algorithm is one of the…
The regular expression matching problem asks whether a given regular expression of length $m$ matches a given string of length $n$. As is well known, the problem can be solved in $O(nm)$ time using Thompson's algorithm. Moreover, recent…
Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…
Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations…
We study the basic regular expression intersection testing problem, which asks to determine whether the intersection of the languages of two regular expressions is nonempty. A textbook solution to this problem is to construct the…
Regular expressions with backreferences (regex, for short), as supported by most modern libraries for regular expression matching, have an NP-complete matching problem. We define a complexity parameter of regex, called active variable…
Given a pattern string $P$ of length $n$ consisting of $\delta$ distinct characters and a query string $T$ of length $m$, where the characters of $P$ and $T$ are drawn from an alphabet $\Sigma$ of size $\Delta$, the {\em exact string…
Given strings $P$ and $Q$ the (exact) string matching problem is to find all positions of substrings in $Q$ matching $P$. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear time…
SMORE (Chen et al., 2023) recently proposed the concept of semantic regular expressions that extend the classical formalism with a primitive to query external oracles such as databases and large language models (LLMs). Such patterns can be…
Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for…
We consider the stable approximation of sparse solutions to non-linear operator equations by means of Tikhonov regularization with a subquadratic penalty term. Imposing certain assumptions, which for a linear operator are equivalent to the…
It is well-known that checking whether a given string $w$ matches a given regular expression $r$ can be done in quadratic time $O(|w|\cdot |r|)$ and that this cannot be improved to a truly subquadratic running time of $O((|w|\cdot…
The currently fastest algorithm for regular expression pattern matching and membership improves the classical O(nm) time algorithm by a factor of about log^{3/2}n. Instead of focussing on general patterns we analyse homogeneous patterns of…
The circular dictionary matching problem is an extension of the classical dictionary matching problem where every string in the dictionary is interpreted as a circular string: after reading the last character of a string, we can move back…
We consider the problem of querying a string (or, a database) of length $N$ bits to determine all the locations where a substring (query) of length $M$ appears either exactly or is within a Hamming distance of $K$ from the query. We assume…