Related papers: A Program That Simplifies Regular Expressions (Too…
I present the most fundamental features of an implemented system designed to manipulate representations of regular languages. The system is structured into two layers, allowing regular languages to be represented in an increasingly compact,…
We present a method to simplify expressions in the context of an equational theory. The basic ideas and concepts of the method have been presented previously elsewhere but here we tackle the difficult task of making it efficient in…
This article proposes a convenient tool for decoding the output of neural networks trained by Connectionist Temporal Classification (CTC) for handwritten text recognition. We use regular expressions to describe the complex structures…
Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…
Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which is able…
The paper presents a REDUCE program for the simplification of tensor expressions that are considered as formal indexed objects. The proposed algorithm is based on the consideration of tensor expressions as vectors in some linear space. This…
We define a new unification algorithm for terms interpreted in semantic domains denoted by a subclass of regular types, here called deterministic regular types. This reflects our intention not to handle the semantic universe as a…
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting. Current simplification systems are predominantly sequence-to-sequence models that…
This paper presents a novel approach to automatically solving arithmetic word problems. This is the first algorithmic approach that can handle arithmetic problems with multiple steps and operations, without depending on additional…
The explosive rate of information growth and availability often makes it increasingly difficult to locate information pertinent to your needs. These problems are often compounded when keyword-based search methodologies are not adequate for…
Information, stored or transmitted in digital form, is often structured. Individual data records are usually represented as hierarchies of their elements. Together, records form larger structures. Information processing applications have to…
Recent work has shown that distributed word representations are good at capturing linguistic regularities in language. This allows vector-oriented reasoning based on simple linear algebra between words. Since many different methods have…
Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations…
We target the problem of provably computing the equivalence between two complex expression trees. To this end, we formalize the problem of equivalence between two such programs as finding a set of semantics-preserving rewrite rules from one…
An expression is any mathematical formula that contains certain formal variables and operations to be executed in a specified order. In computer science, it is usually convenient to represent each expression in the form of an expression…
Regular expressions are pervasive in modern systems. Many real-world regular expressions are inefficient, sometimes to the extent that they are vulnerable to complexity-based attacks, and while much research has focused on detecting…
We study regular expressions that use variables, or parameters, which are interpreted as alphabet letters. We consider two classes of languages denoted by such expressions: under the possibility semantics, a word belongs to the language if…
Optimizing compilers, as well as other translator systems, often work by rewriting expressions according to equivalence preserving rules. Given an input expression and its optimized form, finding the sequence of rules that were applied is a…
A recent paper by Drewes, Hoffmann, and Minas (GCM 2023 proceedings) has shown that certain graph languages can be defined and efficiently recognized by finite automata when strings over typed symbols are interpreted as graphs. This…
We present an algorithm for searching regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data…