Related papers: Learning rational stochastic languages
In probabilistic grammatical inference, a usual goal is to infer a good approximation of an unknown distribution P called a stochastic language. The estimate of P stands in some class of probabilistic models such as probabilistic automata…
The goal of the present paper is to provide a systematic and comprehensive study of rational stochastic languages over a semiring $K \in \{\mathbb{Q}, \mathbb{Q}^+, \mathbb{R}, \mathbb{R}^+\}$. A rational stochastic language is a probability distribution over a free monoid…
Stochastic languages are the languages recognized by probabilistic finite automata (PFAs) with cutpoint over the field of real numbers. More general computational models over the same field such as generalized finite automata (GFAs) and…
When does a deterministic computational model define a probability distribution? What are its properties? This work formalises and settles this stochasticity problem for weighted automata, and its generalisation cost register automata…
The Rational Speech Acts (RSA) model treats language use as a recursive process in which probabilistic speaker and listener agents reason about each other's intentions to enrich the literal semantics of their language along broadly Gricean…
We present probabilistic arithmetic automata (PAAs), a general model to describe chains of operations whose operands depend on chance, along with two different algorithms to exactly calculate the distribution of the results obtained by such…
Probabilistic programs encode stochastic models as ordinary-looking programs with primitives for sampling numbers from predefined distributions and conditioning. Their applications include, among many others, machine learning and modeling…
A hallmark of human language is the ability to effectively and efficiently convey contextually relevant information. One theory for how humans reason about language is presented in the Rational Speech Acts (RSA) framework, which captures…
Determining whether an unknown distribution matches a known reference is a cornerstone problem in distributional analysis. While classical results establish a rigorous framework in the case of distributions over finite domains, real-world…
Let $\mathcal{P}(\Sigma^*)$ be the semiring of languages, and consider its subset $\mathcal{P}(\Sigma)$. In this paper we define the language recognized by a weighted automaton over $\mathcal{P}(\Sigma)$ and a one-letter alphabet.…
Synchronous languages are now a standard industry tool for critical embedded systems. Designers write high-level specifications by composing streams of values using block diagrams. These languages have been extended with Bayesian reasoning…
The article defines and studies the genus of finite state deterministic automata (FSA) and regular languages. Indeed, an FSA can be seen as a graph for which the notion of genus arises. At the same time, an FSA has a semantics via its…
Today's probabilistic language generators fall short when it comes to producing coherent and fluent text despite the fact that the underlying models perform well under standard metrics, e.g., perplexity. This discrepancy has puzzled the…
Language models are essentially probability distributions over token sequences. Auto-regressive models generate sentences by iteratively computing and sampling from the distribution of the next token. This iterative sampling introduces…
Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences). Commonly, rationales are modeled as stochastic binary masks, requiring sampling-based gradient…
Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are…
Regular expressions in an Automata Theory and Formal Languages course are mostly treated as a theoretical topic. That is, to some degree their mathematical properties and their role in describing languages are discussed. This approach fails to…
Random experiments that are simple and clear enough to be performed by human agents feature prominently in the teaching of elementary stochastics as well as in games. We present Alea, a domain-specific language for the specification of…
Inspired by distributed algorithms, we introduce a new class of finite graph automata that recognize precisely the graph languages definable in monadic second-order logic. For the cases of words and trees, it has been long known that the…
Stochastic discriminative EM (sdEM) is an online-EM-type algorithm for discriminative training of probabilistic generative models belonging to the exponential family. In this work, we introduce and justify this algorithm as a stochastic…