Related papers: Efficient Enumeration Algorithms for Annotated Gra…

Constant-delay enumeration for SLP-compressed documents

We study the problem of enumerating results from a query over a compressed document. The compression model we use is the straight-line program (SLP), which is defined by a context-free grammar that produces a single string. For our…

Data Structures and Algorithms · Computer Science 2025-02-26 Martín Muñoz, Cristian Riveros

Automated Essay Scoring Incorporating Annotations from Automated Feedback Systems

This study illustrates how incorporating feedback-oriented annotations into the scoring pipeline can enhance the accuracy of automated essay scoring (AES). This approach is demonstrated with the Persuasive Essays for Rating, Selecting, and…

Computation and Language · Computer Science 2025-09-03 Christopher Ormerod

Grammars for Document Spanners

We propose a new grammar-based language for defining information-extractors from documents (text) that is built upon the well-studied framework of document spanners for extracting structured data from text. While previously studied…

Databases · Computer Science 2023-01-25 Liat Peterfreund

Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems

We propose a new evaluation for automatic solvers for algebra word problems, which can identify mistakes that existing evaluations overlook. Our proposal is to evaluate such solvers using derivations, which reflect how an equation system…

Computation and Language · Computer Science 2017-01-12 Shyam Upadhyay, Ming-Wei Chang

Optimal and Efficient Binary Questioning for Human-in-the-Loop Annotation

Even though data annotation is extremely important for interpretability, research and development of artificial intelligence solutions, most research efforts such as active learning or few-shot learning focus on the sample efficiency…

Machine Learning · Computer Science 2023-07-06 Franco Marchesoni-Acland, Jean-Michel Morel, Josselin Kherroubi, Gabriele Facciolo

Efficient Normal-Form Parsing for Combinatory Categorial Grammar

Under categorial grammars that have powerful rules like composition, a simple n-word sentence can have exponentially many parses. Generating all parses is inefficient and obscures whatever true semantic ambiguities are in the input. This…

cmp-lg · Computer Science 2008-02-03 Jason Eisner

Enumerating Regular Languages with Bounded Delay

We study the task, for a given language $L$, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the…

Formal Languages and Automata Theory · Computer Science 2023-01-10 Antoine Amarilli, Mikaël Monet

Extending an Event-type Ontology: Adding Verbs and Classes Using Fine-tuned LLMs Suggestions

In this project, we have investigated the use of advanced machine learning methods, specifically fine-tuned large language models, for pre-annotating data for a lexical extension task, namely adding descriptive words (verbs) to an existing…

Computation and Language · Computer Science 2023-08-11 Jana Straková, Eva Fučíková, Jan Hajič, Zdeňka Urešová

Improving Span-based Question Answering Systems with Coarsely Labeled Data

We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance and show that coarsely annotated data can bring significant performance gains.…

Computation and Language · Computer Science 2018-11-07 Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova

Constant delay algorithms for regular document spanners

Regular expressions and automata models with capture variables are core tools in rule-based information extraction. These formalisms, also called regular document spanners, use regular languages in order to locate the data that a user wants…

Databases · Computer Science 2018-03-15 Fernando Florenzano, Cristian Riveros, Martin Ugarte, Stijn Vansummeren, Domagoj Vrgoc

Coarse-to-Fine Annotation Enrichment for Semantic Segmentation Learning

Rich high-quality annotated data is critical for semantic segmentation learning, yet acquiring dense and pixel-wise ground-truth is both labor- and time-consuming. Coarse annotations (e.g., scribbles, coarse polygons) offer an economical…

Computer Vision and Pattern Recognition · Computer Science 2018-08-29 Yadan Luo, Ziwei Wang, Zi Huang, Yang Yang, Cong Zhao

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that…

Computation and Language · Computer Science 2022-09-27 Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych

Model-based annotation of coreference

Humans do not make inferences over texts, but over models of what texts are about. When annotators are asked to annotate coreferent spans of text, it is therefore a somewhat unnatural task. This paper presents an alternative in which we…

Computation and Language · Computer Science 2020-03-03 Rahul Aralikatte, Anders Søgaard

Argument Mining as a Text-to-Text Generation Task

Argument Mining (AM) aims to uncover the argumentative structures within a text. Previous methods require several subtasks, such as span identification, component classification, and relation classification. Consequently, these methods need…

Computation and Language · Computer Science 2026-03-26 Masayuki Kawarada, Tsutomu Hirao, Wataru Uchida, Masaaki Nagata

Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort

Large amounts of annotated data have become more important than ever, especially since the rise of deep learning techniques. However, manual annotations are costly. We propose a tool that enables researchers to create large, high-quality,…

Digital Libraries · Computer Science 2021-12-23 Franziska Weeber, Felix Hamborg, Karsten Donnay, Bela Gipp

A Neural Model for Regular Grammar Induction

Grammatical inference is a classical problem in computational learning theory and a topic of wider influence in natural language processing. We treat grammars as a model of computation and propose a novel neural approach to induction of…

Machine Learning · Computer Science 2022-10-04 Peter Belcák, David Hofer, Roger Wattenhofer

EduBERT: Pretrained Deep Language Models for Learning Analytics

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to…

Computers and Society · Computer Science 2019-12-03 Benjamin Clavié, Kobi Gal

Fine-Grained Perspectives: Modeling Explanations with Annotator-Specific Rationales

Beyond exploring disaggregated labels for modeling perspectives, annotator rationales provide fine-grained signals of individual perspectives. In this work, we propose a framework for jointly modeling annotator-specific label prediction and…

Computation and Language · Computer Science 2026-04-24 Olufunke O. Sarumi, Charles Welch, Daniel Braun

Structured Prediction with Output Embeddings for Semantic Image Annotation

We address the task of annotating images with semantic tuples. Solving this problem requires an algorithm which is able to deal with hundreds of classes for each argument of the tuple. In such contexts, data sparsity becomes a key…

Computer Vision and Pattern Recognition · Computer Science 2015-09-08 Ariadna Quattoni, Arnau Ramisa, Pranava Swaroop Madhyastha, Edgar Simo-Serra, Francesc Moreno-Noguer

Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection

Search methods based on Pretrained Language Models (PLM) have demonstrated great effectiveness gains compared to statistical and early neural ranking models. However, fine-tuning PLM-based rankers requires a great amount of annotated…

Information Retrieval · Computer Science 2023-09-13 Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury