Related papers: Parameterized DAWGs: efficient constructions and b…
Let $\Sigma$ and $\Pi$ be disjoint alphabets, respectively called the static alphabet and the parameterized alphabet. Two strings $x$ and $y$ over $\Sigma \cup \Pi$ of equal length are said to parameterized match (p-match) if there exists a…
A parameterized string (p-string) is a string over an alphabet $(\Sigma_{s} \cup \Sigma_{p})$, where $\Sigma_{s}$ and $\Sigma_{p}$ are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively.…
Let $\Sigma$ and $\Pi$ be disjoint alphabets of respective size $\sigma$ and $\pi$. Two strings over $\Sigma \cup \Pi$ of equal length are said to parameterized match (p-match) if there is a bijection $f:\Sigma \cup \Pi \rightarrow \Sigma…
The directed acyclic word graph (DAWG) of a string $y$ of length $n$ is the smallest (partial) DFA which recognizes all suffixes of $y$ with only $O(n)$ nodes and edges. In this paper, we show how to construct the DAWG for the input string…
We consider construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection $\mathcal{T}$ of texts, where a new symbol may be appended to any text in $\mathcal{T} = \{T_1, \ldots, T_K\}$,…
We deal with the problem of maintaining the suffix tree indexing structure for a fully-online collection of multiple strings, where a new character can be prepended to any string in the collection at any time. The only previously known…
Two strings of equal length are said to parameterized match if there is a bijection that maps the characters of one string to those of the other string, so that two strings become identical. The parameterized pattern matching problem is,…
The compact directed acyclic word graph (CDAWG) is the minimal compact automaton that recognizes all the suffixes of a string. Classically the CDAWG has been implemented as an index of the string it recognizes, requiring $o(n)$ space for a…
Parameterized strings are a generalization of strings in that their characters are drawn from two different alphabets, where one is considered to be the alphabet of static characters and the other to be the alphabet of parameter characters.…
Two strings are said to parameterized match when there exists a bijection of the parameterized alphabet onto itself that transforms one string into the other. Parameterized matching has applications in software duplication…
Given a string $T$, it is known that its suffix tree can be represented using the compact directed acyclic word graph (CDAWG) with $e_T$ arcs, taking overall $O(e_T+e_{{\overline{T}}})$ words of space, where ${\overline{T}}$ is the reverse…
In this paper, we present the first study of the computational complexity of converting an automata-based text index structure, called the Compact Directed Acyclic Word Graph (CDAWG), of size $e$ for a text $T$ of length $n$ into other text…
A deterministic BSP algorithm for constructing the suffix array of a given string is presented, based on a technique which we call accelerated sampling. It runs in optimal $O(n/p)$ local computation and communication, and requires a near…
We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings $A$ and $B$ is a compacted trie representing all suffixes in $A$ and $B$. It has…
We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be…
The parameterized matching problem is a variant of string matching, which is to search for all parameterized occurrences of a pattern $P$ in a text $T$. In considering matching algorithms, the combinatorial natures of strings, especially…
In the computational-mechanics structural analysis of one-dimensional cellular automata the following automata-theoretic analogue of the \emph{change-point problem} from time series analysis arises: \emph{Given a string $\sigma$ and a…
In this paper, we propose a new indexing structure for parameterized strings which we call PLSTs, by generalizing linear-size suffix tries for ordinary strings. Two parameterized strings are said to match if there is a bijection on the…
Recently, Cenzato et al. proposed a new text index, called the \emph{suffixient array}, which is a subset of the suffix array and supports locating a single pattern occurrence or finding its maximal exact matches (MEMs), assuming random…
The compact directed acyclic word graph (CDAWG) of a string $T$ is an index occupying $O(\mathsf{e})$ space, where $\mathsf{e}$ is the number of right extensions of maximal repeats in $T$. For highly repetitive datasets, the measure…
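Several of the abstracts above define parameterized matching (p-matching). As a rough illustration of that notion, here is a minimal sketch. It assumes the common formulation in which static symbols must match exactly and parameter symbols are related by a bijection on the parameter alphabet. The function name `p_match` and the `static` argument are hypothetical, chosen for illustration, and are not taken from any of the listed papers.

```python
def p_match(x, y, static):
    """Return True if strings x and y parameterized-match.

    Assumption: `static` is the set of static symbols; every other
    symbol is treated as a parameter symbol. x and y p-match if some
    bijection on parameter symbols (identity on static symbols)
    turns x into y.
    """
    if len(x) != len(y):
        return False
    forward = {}   # parameter symbol of x -> parameter symbol of y
    backward = {}  # parameter symbol of y -> parameter symbol of x
    for a, b in zip(x, y):
        if a in static or b in static:
            # Static symbols must be identical at the same position.
            if a != b:
                return False
            continue
        # Both mappings must stay consistent, so the partial map is
        # injective and can be extended to a bijection.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


if __name__ == "__main__":
    static_symbols = set("abc")
    print(p_match("axbycx", "aybxcy", static_symbols))  # x<->y swap
    print(p_match("axbx", "aybz", static_symbols))      # x cannot map to both y and z
```

This direct check runs in time linear in the string length for a single pair of strings. The index structures described in the abstracts target the harder problem of finding all p-matching occurrences of a pattern in a text.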