Related papers: Fast Algorithm for Partial Covers in Words
A word $u=u_1\dots u_n$ is a scattered factor of a word $w$ if $u$ can be obtained from $w$ by deleting some of its letters: there exist (potentially empty) words $v_0,v_1,\ldots,v_n$ such that $w = v_0u_1v_1\cdots u_nv_n$. The set of all…
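The definition above can be checked directly with a greedy left-to-right scan. The following is a minimal sketch for illustration only; the function name `is_scattered_factor` is hypothetical and does not come from the paper.

```python
def is_scattered_factor(u: str, w: str) -> bool:
    """Return True if u can be obtained from w by deleting letters.

    Greedy scan: each letter of u is matched with its earliest
    possible occurrence in the part of w not yet consumed.
    """
    remaining = iter(w)
    # `ch in remaining` advances the iterator until ch is found,
    # so later letters of u are only searched for further right.
    return all(ch in remaining for ch in u)
```

For instance, `is_scattered_factor("ace", "abcde")` holds because deleting `b` and `d` from `abcde` leaves `ace`, while `"aec"` is not a scattered factor of `abcde` since the order of letters must be preserved.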
Starting in the 1970s with the fundamental work of Imre Simon, \emph{scattered factors} (also known as subsequences or scattered subwords) have remained a consistently and heavily studied object. The majority of work on scattered factors…
A border u of a word w is a proper factor of w occurring both as a prefix and as a suffix. The maximal unbordered factor of w is the longest factor of w which does not have a border. Here an O(n log n)-time with high probability (or O(n log…
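To make the two definitions above concrete, here is a small sketch that computes the longest border with the classical prefix function and then finds the length of a maximal unbordered factor by brute force. It assumes borders are nonempty, and it is a naive cubic-time illustration, not the algorithm announced in the abstract; both function names are hypothetical.

```python
def longest_border(w: str) -> int:
    """Length of the longest nonempty proper border of w (0 if none)."""
    n = len(w)
    if n == 0:
        return 0
    pi = [0] * n  # pi[i] = longest border length of w[:i+1]
    k = 0
    for i in range(1, n):
        while k > 0 and w[i] != w[k]:
            k = pi[k - 1]
        if w[i] == w[k]:
            k += 1
        pi[i] = k
    return pi[-1]


def max_unbordered_factor_length(w: str) -> int:
    """Brute force: try every factor, keep the longest unbordered one."""
    best = 0
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            if j - i > best and longest_border(w[i:j]) == 0:
                best = j - i
    return best
```

For example, `abacaba` has longest border `aba`, and in `abab` every factor of length 3 or 4 is bordered, so its maximal unbordered factors have length 2.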
For $\alpha\geq 1$, an $\alpha$-gapped repeat in a word $w$ is a factor $uvu$ of $w$ such that $|uv|\leq \alpha |u|$; the two factors $u$ in such a repeat are called arms, while the factor $v$ is called the gap. Such a repeat is called maximal…
A reconstruction problem of words from scattered factors asks for the minimal information, like multisets of scattered factors of a given length or the number of occurrences of scattered factors from a given set, necessary to uniquely…
A gapped repeat is a factor of the form $uvu$ where $u$ and $v$ are nonempty words. The period of the gapped repeat is defined as $|u|+|v|$. The gapped repeat is maximal if it cannot be extended to the left or to the right by at least one…
We study word reconstruction problems. Improving a previous result by P. Fleischmann, M. Lejeune, F. Manea, D. Nowotka and M. Rigo, we prove that, for any unknown word $w$ of length $n$ over an alphabet of cardinality $k$, $w$ can be…
Partial words are sequences over a finite alphabet that may contain wildcard symbols, called holes, which match or are compatible with all letters; partial words without holes are said to be full words (or simply words). Given an infinite…
For $0<\delta <1$, a $\delta$-subrepetition in a word is a factor whose exponent is less than 2 but not less than $1+\delta$ (the exponent of the factor is the ratio of the factor length to its minimal period). The $\delta$-subrepetition…
The binomial notation $\binom{w}{u}$ represents the number of occurrences of the word $u$ as a (scattered) subword in $w$. We first introduce and study possible uses of a geometrical interpretation of $\binom{w}{ab}$ and $\binom{w}{ba}$ when $a$ and $b$ are distinct letters.…
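The count of occurrences of u as a scattered subword of w can be computed with a standard dynamic program over the prefixes of u. The sketch below is illustrative only; the name `subword_binomial` is hypothetical and is not taken from the paper.

```python
def subword_binomial(w: str, u: str) -> int:
    """Number of occurrences of u as a scattered subword of w."""
    # counts[j] = number of ways to embed u[:j] in the part of w read so far
    counts = [1] + [0] * len(u)
    for ch in w:
        # Go right to left so each letter of w extends an embedding once.
        for j in range(len(u), 0, -1):
            if u[j - 1] == ch:
                counts[j] += counts[j - 1]
    return counts[len(u)]
```

With w = `abab`, the word `ab` occurs 3 times and `ba` occurs once as a scattered subword. Their sum, 4, equals the number of a's times the number of b's in w, since every pair of positions holding an a and a b contributes to exactly one of the two counts.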
An absent factor of a string $w$ is a string $u$ which does not occur as a contiguous substring (a.k.a. factor) inside $w$. We extend this well-studied notion and define absent subsequences: a string $u$ is an absent subsequence of a string…
We consider the problem of computing a shortest solid cover of an indeterminate string. An indeterminate string may contain non-solid symbols, each of which specifies a subset of the alphabet that could be present at the corresponding…
A closed word (a.k.a. periodic-like word or complete first return) is a word whose longest border does not have internal occurrences, or, equivalently, whose longest repeated prefix is not right special. We investigate the structure of…
We improve the running times of $O(1)$-approximation algorithms for the set cover problem in geometric settings, specifically, covering points by disks in the plane, or covering points by halfspaces in three dimensions. In the unweighted…
Given a finite alphabet $\Sigma$ and a right-infinite word $w$ over the alphabet $\Sigma$, we construct a topological space ${\rm Rec}(w)$ consisting of all right-infinite recurrent words whose factors are all factors of $w$, where we work…
We introduce subsequence covers (s-covers, in short), a new type of covers of a word. A word $C$ is an s-cover of a word $S$ if the occurrences of $C$ in $S$ as subsequences cover all the positions in $S$. The s-covers seem to be…
Given a finite alphabet $\Sigma$ and a right-infinite word $\bf w$ over $\Sigma$, we define the Lie complexity function $L_{\bf w}:\mathbb{N}\to \mathbb{N}$, whose value at $n$ is the number of conjugacy classes (under cyclic shift) of…
Scattered factor (circular) universality was first introduced by Barker et al. in 2020. A word $w$ is called $k$-universal for some natural number $k$ if every word of length $k$ over $w$'s alphabet occurs as a scattered factor in $w$; it…
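As an illustration of the (non-circular) notion, the largest k for which a word is k-universal can be found greedily. The sketch below relies on the well-known arch factorization, which is not stated in the abstract above: the word is cut as soon as every letter of its alphabet has been seen, and the number of complete blocks gives the largest such k. The name `universality_index` is hypothetical.

```python
def universality_index(w: str) -> int:
    """Largest k such that w is k-universal over its own alphabet."""
    alphabet = set(w)
    if not alphabet:
        return 0
    seen = set()
    arches = 0
    for ch in w:
        seen.add(ch)
        if len(seen) == len(alphabet):
            # A complete arch: every letter occurred since the last cut.
            arches += 1
            seen.clear()
    return arches
```

For example, `abba` splits into the arches `ab` and `ba`, so all four words of length 2 over {a, b} are scattered factors of it, whereas `aaa` is not.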
The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word $w$ in a given sequence $x$ can be used for classifying $w$ as avoided or overabundant. The definitions used for the expectation…
An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their…