Related papers: Automatic Rule Generation for Time Expression Norm…
We describe TweeTIME, a temporal tagger for recognizing and normalizing time expressions in Twitter. Most previous work in social media analysis has to rely on temporal resolvers that are designed for well-edited text, and therefore suffer…
Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which is able…
Time normalization is the task of converting natural language temporal expressions into machine-readable representations. It underpins many downstream applications in information retrieval, question answering, and clinical decision-making.…
The detection and normalization of temporal expressions is an important task and preprocessing step for many applications. However, prior work on normalization is rule-based, which severely limits the applicability in real-world…
Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…
Timed regular expressions serve as a formalism for specifying real-time behaviors of Cyber-Physical Systems. In this paper, we consider the synthesis of timed regular expressions, focusing on generating a timed regular expression consistent…
We propose ARISE, a framework that iteratively induces rules and generates synthetic data for text classification. We combine synthetic data generation and automatic rule induction, via bootstrapping, to iteratively filter the generated…
This paper describes a temporal expression identification and normalization system, ManTIME, developed for the TempEval-3 challenge. The identification phase combines the use of conditional random fields along with a post-processing…
Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive. Indeed, computing the forward pass of such models involves solving an ODE which can become arbitrarily complex during training. Recent works have…
Human-designed rules are widely used to build industry applications. However, maintaining thousands of such hand-crafted rules is infeasible, so it is important to integrate rule knowledge into neural networks to build a hybrid…
Abstractive text summarization is a highly difficult problem, and the sequence-to-sequence model has shown success in improving the performance on the task. However, the generated summaries are often inconsistent with the source content in…
Accurate prediction of human behavior is essential for robust and safe human-AI collaboration. However, existing approaches for modeling people are often data-hungry and brittle because they either make unrealistic assumptions about…
Temporal expression (TE) normalization is a well-studied problem. However, the predominantly used rule-based systems are highly restricted to specific settings, and emerging machine learning approaches suffer from a lack of labeled data. In…
Social media networks and chatting platforms often use an informal version of natural text. Adversarial spelling attacks also tend to alter the input text by modifying the characters in the text. Normalizing these texts is an essential step…
Finite-state tree automata are a well-studied formalism for representing term languages. This paper studies the problem of determining the regularity of the set of instances of a finite set of terms with variables, where each variable is…
The generation of synthetic data is a promising technology to make medical data available for secondary use in a privacy-compliant manner. A popular method for creating realistic patient data is the rule-based Synthea data generator.…
Scene text recognition (STR) has attracted much attention over the years because of its wide application. Most methods train STR models in a fully supervised manner, which requires large amounts of labeled data. Although synthetic data contributes…
We formalize synthesis of shared control protocols with correctness guarantees for temporal logic specifications. More specifically, we introduce a modeling formalism in which both a human and an autonomy protocol can issue commands to a…
Regular expressions (regexes) are widely used in different fields of computer science, such as programming languages, string processing, and databases. However, existing tools for synthesizing or repairing regexes always assume that the…
We propose TRACIE, a novel temporal reasoning dataset that evaluates the degree to which systems understand implicit events -- events that are not mentioned explicitly in natural language text but can be inferred from it. This introduces a…