Related papers: Learning Regular Languages over Large Ordered Alph…
In this paper, we present a categorical approach to learning automata over words, in the sense of the $L^*$-algorithm of Angluin. This yields a new generic $L^*$-like algorithm which can be instantiated for learning deterministic automata,…
We investigate a learning algorithm in the context of nominal automata, an extension of classical automata to alphabets featuring names. This class of automata captures nominal regular languages; analogously to the classical language…
We propose a generic categorical framework for learning unknown formal languages of various types (e.g. finite or infinite words, weighted and nominal languages). Our approach is parametric in a monad T that represents the given type of…
Automata learning has many applications in artificial intelligence and software engineering. Central to these applications is the $L^*$ algorithm, introduced by Angluin. The $L^*$ algorithm learns deterministic finite-state automata (DFAs)…
We present an Angluin-style algorithm to learn nominal automata, which are acceptors of languages over infinite (structured) alphabets. The abstract approach we take allows us to seamlessly extend known variations of the algorithm to this…
Let L be an infinite regular language on a totally ordered alphabet (A,<). Feeding a finite deterministic automaton (with output) with the words of L enumerated lexicographically with respect to < leads to an infinite sequence over the…
We give a new proof of a result from well quasi-order theory on the computability of bases for upwards-closed sets of words. This new proof is based on Angluin's L* algorithm, that learns an automaton from a minimally adequate teacher. This…
Generalizations of linear numeration systems in which the set of natural numbers is recognizable by finite automata are obtained by describing an arbitrary infinite regular language following the lexicographic ordering. For these systems of…
Extracting finite state automata (FSAs) from black-box models offers a powerful approach to gaining interpretable insights into complex model behaviors. To support this pursuit, we present a weighted variant of Angluin's (1987)…
This paper examines the characterization and learning of grammars defined with enriched representational models. Model-theoretic approaches to formal language theory traditionally assume that each position in a string belongs to exactly one…
Large Language Models (LLMs) have recently developed new advanced functionalities. Their effectiveness relies on statistical learning and generalization capabilities. However, they face limitations in internalizing the data they process and…
We define a class of languages of infinite words over infinite alphabets, and the corresponding automata. The automata used for recognition are a generalisation of deterministic Muller automata to the setting of nominal sets. Remarkably,…
We present a new learning algorithm for realtime one-counter automata. Our algorithm uses membership and equivalence queries as in Angluin's L* algorithm, as well as counter value queries and partial equivalence queries. In a partial…
What can large language models learn? By definition, language models (LM) are distributions over strings. Therefore, an intuitive way of addressing the above question is to formalize it as a matter of learnability of classes of…
We present a novel algorithm that uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a given trained RNN. We do this using Angluin's L* algorithm as a learner and the trained RNN…
Since the seminal work by Angluin and the introduction of the L*-algorithm, active learning of automata by membership and equivalence queries has been extensively studied to learn various extensions of automata. For weighted automata,…
Ontologies are useful for automatic machine processing of domain knowledge as they represent it in a structured format. Yet, constructing ontologies requires substantial manual effort. To automate part of this process, large language models…
Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic…
Many methods for the verification of complex computer systems require the existence of a tractable mathematical abstraction of the system, often in the form of an automaton. In reality, however, such a model is hard to come up with, in…
Artificial intelligence is making spectacular progress, and one of the best examples is the development of large language models (LLMs) such as OpenAI's GPT series. In these lectures, written for readers with a background in mathematics or…