Related papers: Certifying cost annotations in compilers

Certifying and reasoning about cost annotations of functional programs

We present a so-called labelling method to insert cost annotations in a higher-order functional program, to certify their correctness with respect to a standard compilation chain to assembly code including safe memory management, and to…

Programming Languages · Computer Science 2013-01-17 Roberto M. Amadio, Yann Regis-Gianas

Indexed Labels for Loop Iteration Dependent Costs

We present an extension to the labelling approach, a technique for lifting resource consumption information from compiled to source code. This approach, which is at the core of the annotating compiler from a large fragment of C to 8051…

Programming Languages · Computer Science 2013-06-13 Paolo Tranquilli

Learning a Cost-Effective Annotation Policy for Question Answering

State-of-the-art question answering (QA) relies upon large amounts of training data for which labeling is time consuming and thus expensive. For this reason, customizing QA systems is challenging. As a remedy, we propose a novel framework…

Computation and Language · Computer Science 2020-11-10 Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun

Label Smarter, Not Harder: CleverLabel for Faster Annotation of Ambiguous Image Classification with Higher Quality

High-quality data is crucial for the success of machine learning, but labeling large datasets is often a time-consuming and costly process. While semi-supervised learning can help mitigate the need for labeled data, label quality remains an…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Lars Schmarje, Vasco Grossmann, Tim Michels, Jakob Nazarenus, Monty Santarossa, Claudius Zelenka, Reinhard Koch

Optimizing LLM Annotation of Classroom Discourse through Multi-Agent Orchestration

Large language models (LLMs) are increasingly positioned as scalable tools for annotating educational data, including classroom discourse, interaction logs, and qualitative learning artifacts. Their ability to rapidly summarize…

Artificial Intelligence · Computer Science 2026-03-17 Bakhtawar Ahtisham, Kirk Vanacore, Rene F. Kizilcec

Noisy Annotation Refinement for Object Detection

Supervised training of object detectors requires well-annotated large-scale datasets, whose production is costly. Therefore, some efforts have been made to obtain annotations in economical ways, such as cloud sourcing. However, datasets…

Computer Vision and Pattern Recognition · Computer Science 2021-12-08 Jiafeng Mao, Qing Yu, Yoko Yamakata, Kiyoharu Aizawa

The Accuracy Cost of Weakness: A Theoretical Analysis of Fixed-Segment Weak Labeling for Events in Time

Accurate labels are critical for deriving robust machine learning models. Labels are used to train supervised learning models and to evaluate most machine learning paradigms. In this paper, we model the accuracy and cost of a common weak…

Machine Learning · Computer Science 2025-09-30 John Martinsson, Tuomas Virtanen, Maria Sandsten, Olof Mogren

HyPAC: Cost-Efficient LLMs-Human Hybrid Annotation with PAC Error Guarantees

Data annotation often involves multiple sources with different cost-quality trade-offs, such as fast large language models (LLMs), slow reasoning models, and human experts. In this work, we study the problem of routing inputs to the most…

Machine Learning · Computer Science 2026-02-04 Hao Zeng, Huipeng Huang, Xinhao Qu, Jianguo Huang, Bingyi Jing, Hongxin Wei

Denotation-based Compositional Compiler Verification

A desired but challenging property of compiler verification is compositionality, in the sense that the compilation correctness of a program can be deduced incrementally from that of its substructures ranging from statements, functions, and…

Programming Languages · Computer Science 2026-03-31 Zhang Cheng, Jiyang Wu, Di Wang, Qinxiang Cao

Introducing Certified Compilation in Education by a Functional Language Approach

Classes on compiler technology are commonly found in Computer Science curricula, covering aspects of parsing, semantic analysis, intermediate transformations and target code generation. This paper reports on introducing certified…

Programming Languages · Computer Science 2019-06-28 Per Lindgren, Marcus Lindner, Nils Fitinghoff

MCAL: Minimum Cost Human-Machine Active Labeling

Today, ground-truth generation uses data sets annotated by cloud-based annotation services. These services rely on human annotation, which can be prohibitively expensive. In this paper, we consider the problem of hybrid human-machine…

Machine Learning · Computer Science 2023-02-28 Hang Qiu, Krishna Chintalapudi, Ramesh Govindan

From LLM-anation to LLM-orchestrator: Coordinating Small Models for Data Labeling

Although the annotation paradigm based on Large Language Models (LLMs) has made significant breakthroughs in recent years, its actual deployment still has two core bottlenecks: first, the cost of calling commercial APIs in large-scale…

Computation and Language · Computer Science 2025-06-23 Yao Lu, Zhaiyuan Ji, Jiawei Du, Yu Shanqing, Qi Xuan, Tianyi Zhou

Committee-Based Sample Selection for Probabilistic Classifiers

In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by 'sample selection'. In this approach, during training the…

Artificial Intelligence · Computer Science 2011-06-02 S. Argamon-Engelson, I. Dagan

Eliciting and Learning with Soft Labels from Every Annotator

The labels used to train machine learning (ML) models are of paramount importance. Typically for ML classification tasks, datasets contain hard labels, yet learning using soft labels has been shown to yield benefits for model…

Machine Learning · Computer Science 2022-08-31 Katherine M. Collins, Umang Bhatt, Adrian Weller

Analyzing Text Representations under Tight Annotation Budgets: Measuring Structural Alignment

Annotating large collections of textual data can be time consuming and expensive. That is why the ability to train models with limited annotation budgets is of great importance. In this context, it has been shown that under tight annotation…

Computation and Language · Computer Science 2022-10-13 César González-Gutiérrez, Audi Primadhanty, Francesco Cazzaro, Ariadna Quattoni

Visual Notations in Container Orchestrations: An Empirical Study with Docker Compose

Context: Container orchestration tools supporting infrastructure-as-code allow new forms of collaboration between developers and operatives. Still, their text-based nature permits naive mistakes and is more difficult to read as complexity…

Software Engineering · Computer Science 2022-07-20 Bruno Piedade, João Pedro Dias, Filipe F. Correia

Optimized Polynomial Evaluation with Semantic Annotations

In this paper we discuss how semantic annotations can be used to introduce mathematical algorithmic information of the underlying imperative code to enable compilers to produce code transformations that will enable better performance. By…

Programming Languages · Computer Science 2016-03-14 Daniel Rubio Bonilla, Colin W. Glass, Jan Kuper

Compute-Efficient Active Learning

Active learning, a powerful paradigm in machine learning, aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset. However, the traditional active learning process often demands extensive…

Machine Learning · Computer Science 2024-01-17 Gábor Németh, Tamás Matuszka

A Survey on Machine Learning Techniques for Auto Labeling of Video, Audio, and Text Data

Machine learning has been utilized to perform tasks in many different domains such as classification, object detection, image segmentation and natural language analysis. Data labeling has always been one of the most important tasks in…

Machine Learning · Computer Science 2021-09-09 Shikun Zhang, Omid Jafari, Parth Nagarkar

Clean or Annotate: How to Spend a Limited Data Collection Budget

Crowdsourcing platforms are often used to collect datasets for training machine learning models, despite higher levels of inaccurate labeling compared to expert labeling. There are two common strategies to manage the impact of such noise.…

Computation and Language · Computer Science 2022-06-14 Derek Chen, Zhou Yu, Samuel R. Bowman