Related papers: Differential coverage: automating coverage analysi…

Diverse Complexity Measures for Dataset Curation in Self-driving

Modern self-driving autonomy systems heavily rely on deep learning. As a consequence, their performance is influenced significantly by the quality and richness of the training data. Data collecting platforms can generate many hours of raw…

Machine Learning · Computer Science 2021-01-19 Abbas Sadat, Sean Segal, Sergio Casas, James Tu, Bin Yang, Raquel Urtasun, Ersin Yumer

Online Bin Covering: Expectations vs. Guarantees

Bin covering is a dual version of classic bin packing. Thus, the goal is to cover as many bins as possible, where covering a bin means packing items of total size at least one in the bin. For online bin covering, competitive analysis fails…

Data Structures and Algorithms · Computer Science 2014-02-28 Marie G. Christ, Lene M. Favrholdt, Kim S. Larsen

CoverageBench: Evaluating Information Coverage across Tasks and Domains

We wish to measure the information coverage of an ad hoc retrieval algorithm, that is, how much of the range of available relevant information is covered by the search results. Information coverage is a central aspect for retrieval,…

Information Retrieval · Computer Science 2026-03-23 Saron Samuel, Andrew Yates, Dawn Lawrie, Ian Soboroff, Trevor Adriaanse, Benjamin Van Durme, Eugene Yang

Assessing and Remedying Coverage for a Given Dataset

Data analysis impacts virtually every aspect of our society today. Often, this analysis is performed on an existing dataset, possibly collected through a process that the data scientists had limited control over. The existing data analyzed…

Databases · Computer Science 2023-04-27 Abolfazl Asudeh, Zhongjun Jin, H. V. Jagadish

A Review of automatic differentiation and its efficient implementation

Derivatives play a critical role in computational statistics, examples being Bayesian inference using Hamiltonian Monte Carlo sampling and the training of neural networks. Automatic differentiation is a powerful tool to automate the…

Mathematical Software · Computer Science 2019-03-27 Charles C. Margossian

Dynamic Code Coverage with Progressive Detail Levels

Nowadays, locating software components responsible for observed failures is one of the most expensive and error-prone tasks in the software development process. To improve the debugging process efficiency, some effort was already made to…

Software Engineering · Computer Science 2013-06-20 Alexandre Perez

A Thorough Investigation of Content-Defined Chunking Algorithms for Data Deduplication

Data deduplication emerged as a powerful solution for reducing storage and bandwidth costs in cloud settings by eliminating redundancies at the level of chunks. This has spurred the development of numerous Content-Defined Chunking (CDC)…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-22 Marcel Gregoriadis, Leonhard Balduf, Björn Scheuermann, Johan Pouwelse

Quantifying and Improving Adaptivity in Conformal Prediction through Input Transformations

Conformal prediction constructs a set of labels instead of a single point prediction, while providing a probabilistic coverage guarantee. Beyond the coverage guarantee, adaptiveness to example difficulty is an important property. It means…

Machine Learning · Computer Science 2025-11-18 Sooyong Jang, Insup Lee

Bridging the Gap between Structural and Semantic Similarity in Diverse Planning

Diverse planning is the problem of finding multiple plans for a given problem specification, which is at the core of many real-world applications. For example, diverse planning is a critical piece for the efficiency of plan recognition…

Artificial Intelligence · Computer Science 2023-10-04 Mustafa F. Abdelwahed, Joan Espasa, Alice Toniolo, Ian P. Gent

On-the-fly Data Assessment for High Throughput X-ray Diffraction Measurement

Investment in brighter sources and larger and faster detectors has accelerated the speed of data acquisition at national user facilities. The accelerated data acquisition offers many opportunities for discovery of new materials, but it also…

Materials Science · Physics 2017-09-28 Fang Ren, Ronald Pandolfi, Douglas Van Campen, Alexander Hexemer, Apurva Mehta

A Primer on the Data Cleaning Pipeline

The availability of both structured and unstructured databases, such as electronic health data, social media data, patent data, and surveys that are often updated in real time, among others, has grown rapidly over the past decade. With this…

Databases · Computer Science 2023-07-26 Rebecca C. Steorts

DDUO: General-Purpose Dynamic Analysis for Differential Privacy

Differential privacy enables general statistical analysis of data with formal guarantees of privacy protection at the individual level. Tools that assist data analysts with utilizing differential privacy have frequently taken the form of…

Programming Languages · Computer Science 2021-03-17 Chike Abuah, Alex Silence, David Darais, Joe Near

A Comprehensive Guide to Differential Privacy: From Theory to User Expectations

The increasing availability of personal data has enabled significant advances in fields such as machine learning, healthcare, and cybersecurity. However, this data abundance also raises serious privacy concerns, especially in light of…

Cryptography and Security · Computer Science 2026-04-24 Napsu Karmitsa, Antti Airola, Tapio Pahikkala, Tinja Pitkämäki

Split Conformal Prediction under Data Contamination

Conformal prediction is a non-parametric technique for constructing prediction intervals or sets from arbitrary predictive models under the assumption that the data is exchangeable. It is popular as it comes with theoretical guarantees on…

Machine Learning · Statistics 2025-12-01 Jase Clarkson, Wenkai Xu, Mihai Cucuringu, Yvik Swan, Gesine Reinert

D2CoPlan: A Differentiable Decentralized Planner for Multi-Robot Coverage

Centralized approaches for multi-robot coverage planning problems suffer from the lack of scalability. Learning-based distributed algorithms provide a scalable avenue in addition to bringing data-oriented feature generation capabilities to…

Robotics · Computer Science 2022-09-21 Vishnu Dutt Sharma, Lifeng Zhou, Pratap Tokekar

Conformal Prediction Sets with Improved Conditional Coverage using Trust Scores

Standard conformal prediction offers a marginal guarantee on coverage, but for prediction sets to be truly useful, they should ideally ensure coverage conditional on each test point. Unfortunately, it is impossible to achieve exact,…

Machine Learning · Computer Science 2025-02-11 Jivat Neet Kaur, Michael I. Jordan, Ahmed Alaa

Batchwise Probabilistic Incremental Data Cleaning

Lack of data and data quality issues are among the main bottlenecks that prevent further artificial intelligence adoption within many organizations, pushing data scientists to spend most of their time cleaning data before being able to…

Databases · Computer Science 2020-11-11 Paulo H. Oliveira, Daniel S. Kaster, Caetano Traina-Jr., Ihab F. Ilyas

Diversification Methods for Zero-One Optimization

We introduce new diversification methods for zero-one optimization that significantly extend strategies previously introduced in the setting of metaheuristic search. Our methods incorporate easily implemented strategies for partitioning…

Artificial Intelligence · Computer Science 2017-03-24 Fred Glover

Human-Centric Data Cleaning [Vision]

Data Cleaning refers to the process of detecting and fixing errors in the data. Human involvement is instrumental at several stages of this process, e.g., to identify and repair errors, to validate computed repairs, etc. There is currently…

Databases · Computer Science 2018-01-03 El Kindi Rezig, Mourad Ouzzani, Ahmed K. Elmagarmid, Walid G. Aref

A Unified Theory of Conditional Coverage in Conformal Prediction with Applications

Conformal prediction provides prediction sets with finite-sample marginal coverage, but many applications require coverage guarantees that adapt to individual test points, a subpopulation, or a structural component of the data. Existing…

Methodology · Statistics 2026-05-27 Yinjie Min, Liuhua Peng, Changliang Zou