Related papers: Regular Languages meet Prefix Sorting
Ordering the collection of states of a given automaton starting from an order of the underlying alphabet is a natural move towards a computational treatment of the language accepted by the automaton. Along this path, Wheeler \emph{graphs}…
Wheeler nondeterministic finite automata (WNFAs) were introduced as a generalization of prefix sorting from strings to labeled graphs. WNFAs admit optimal solutions to classic hard problems on labeled graphs and languages. The problem of…
Given an order of the underlying alphabet we can lift it to the states of a finite deterministic automaton: to compare states we use the order of the strings reaching them. When the order on strings is the co-lexicographic one \emph{and}…
Wheeler automata were introduced in 2017 as a tool to generalize existing indexing and compression techniques based on the Burrows-Wheeler transform. Intuitively, an automaton is said to be Wheeler if there exists a total order on its…
A Wheeler automaton is a finite state automaton whose states admit a total Wheeler order, reflecting the co-lexicographic order of the strings labeling source-to-node paths. A Wheeler language is a regular language admitting an accepting…
Sorting is a fundamental algorithmic pre-processing technique which often allows to represent data more compactly and, at the same time, speeds up search queries on it. In this paper, we focus on the well-studied problem of sorting and…
Recently, a new paradigm was introduced in automata theory. The main idea is to classify regular languages according to their propensity to be sorted, establishing a deep connection between automata theory and data compression [J. ACM…
The notion of Wheeler languages is rooted in the Burrows-Wheeler transform (BWT), one of the most central concepts in data compression and indexing. The BWT has been generalized to finite automata, the so-called Wheeler automata, by Gagie…
Wheeler DFAs (WDFAs) are a sub-class of finite-state automata which is playing an important role in the emerging field of compressed data structures: as opposed to general automata, WDFAs can be stored in just $\log\sigma + O(1)$ bits per…
An index for a finite automaton is a powerful data structure that supports locating paths labeled with a query pattern, thus solving pattern matching on the underlying regular language. In this paper, we solve the long-standing problem of…
In the present work, we tackle the regular language indexing problem by first studying the hierarchy of $p$-sortable languages: regular languages accepted by automata of width $p$. We show that the hierarchy is strict and does not collapse,…
Co-lex partial orders were recently introduced in (Cotumaccio et al., SODA 2021 and JACM 2023) as a powerful tool to index finite state automata, with applications to regular expression matching. They generalize Wheeler orders (Gagie et…
Analogous to regular string and tree languages, regular languages of directed acyclic graphs (DAGs) are defined in the literature. Although called regular, those DAG-languages are more powerful and, consequently, standard problems have a…
Families of DFAs (FDFAs) have recently been introduced as a new representation of $\omega$-regular languages. They target ultimately periodic words, with acceptors revolving around accepting some representation $u\cdot v^\omega$. Three…
The recently introduced class of Wheeler graphs, inspired by the Burrows-Wheeler Transform (BWT) of a given string, admits an efficient index data structure for searching for subpaths with a given path label, and lifts the applicability of…
The set of finite words over a well-quasi-ordered set is itself well-quasi-ordered. This seminal result by Higman is a cornerstone of the theory of well-quasi-orderings and has found numerous applications in computer science. However, this…
The states of a deterministic finite automaton A can be identified with collections of words in Pf(L(A)) -- the set of prefixes of words belonging to the regular language accepted by A. But words can be ordered and among the many possible…
In the present work, we lay out a new theory showing that all automata can always be co-lexicographically partially ordered, and an intrinsic measure of their complexity can be defined and effectively determined, namely, the minimum width…
Given a regular language L over an ordered alphabet $\Sigma$, the set of lexicographically smallest (resp., largest) words of each length is itself regular. Moreover, there exists an unambiguous finite-state transducer that, on a given word…
We introduce deterministic suffix-reading automata (DSA), a new automaton model over finite words. Transitions in a DSA are labeled with words. From a state, a DSA triggers an outgoing transition on seeing a word ending with the…