Related papers: Pattern-based Subterm Selection in Isabelle
Simple type theory is formulated for use with the generic theorem prover Isabelle. This requires explicit type inference rules. There are function, product, and subset types, which may be empty. Descriptions (the eta-operator) introduce the…
We introduce a theorem proving approach to the specification and generation of temporal logical constraints for training neural networks. We formalise a deep embedding of linear temporal logic over finite traces (LTL$_f$) and an associated…
An automatic word classification system has been designed which processes word unigram and bigram frequency statistics extracted from a corpus of natural language utterances. The system implements a binary top-down form of word clustering…
(Part of the abstract) In this thesis, we investigate the use of unsupervised spoken term discovery in tackling this problem. Unsupervised spoken term discovery aims to discover topic-related terminologies in a speech without knowing the…
We significantly improve the performance of the E automated theorem prover on the Isabelle Sledgehammer problems by combining learning and theorem proving in several ways. In particular, we develop targeted versions of the ENIGMA guidance…
Words in some natural languages can have a composite structure. Elements of this structure include the root (that could also be composite), prefixes and suffixes with which various nuances and relations to other words can be expressed.…
In this paper, we propose a pattern-based term extraction approach for Japanese, applying ACABIT system originally developed for French. The proposed approach evaluates termhood using morphological patterns of basic terms and term variants.…
Isabelle is a generic theorem prover, designed for interactive reasoning in a variety of formal theories. At present it provides useful proof procedures for Constructive Type Theory, various first-order logics, Zermelo-Fraenkel set theory,…
Subsequence-based time series classification algorithms provide accurate and interpretable models, but training these models is extremely computation intensive. The asymptotic time complexity of subsequence-based algorithms remains a…
In this article we investigate how a subterm pattern matching algorithm can be exploited to implement efficient term rewriting procedures. From the left-hand sides of the rewrite system we construct a set automaton, which can be used to…
Probabilistic programs provide an expressive representation language for generative models. Given a probabilistic program, we are interested in the task of posterior inference: estimating a latent variable given a set of observed variables.…
Controlling the syntactic structure of text generated by language models is valuable for applications requiring clarity, stylistic consistency, or interpretability, yet it remains a challenging task. In this paper, we argue that sampling…
Natural Language Generation systems typically have two parts - strategic ('what to say') and tactical ('how to say'). We present our experiments in building an unsupervised corpus-driven template based tactical NLG system. We consider…
We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization…
We present a second-order language that can be used to succinctly specify ontologies in a consistent and transparent manner. This language is based on ontology templates (OTTR), a framework for capturing recurring patterns of axioms in…
The introduction of pattern languages in the seminal work [Angluin, ``Finding Patterns Common to a Set of Strings'', JCSS 1980] has revived the classical model of inductive inference (learning in the limit, gold-style learning). In…
Stemming is the process of reducing related words to a standard form by removing affixes from them. Existing algorithms vary with respect to their complexity, configurability, handling of unknown words, and ability to avoid under- and…
Term-based sparse representations dominate the first-stage text retrieval in industrial applications, due to its advantage in efficiency, interpretability, and exact term matching. In this paper, we study the problem of transferring the…
Monte Carlo methods can provide accurate p-value estimates of word counting test statistics and are easy to implement. They are especially attractive when an asymptotic theory is absent or when either the search sequence or the word pattern…
The paper presents a data-driven approach to information extraction (viewed as template filling) using the structured language model (SLM) as a statistical parser. The task of template filling is cast as constrained parsing using the SLM.…