Related papers: Certainty-Validity: A Diagnostic Framework for Dis…

Trust, or Don't Predict: Introducing the CWSA Family for Confidence-Aware Model Evaluation

In recent machine learning systems, confidence scores are being utilized more and more to manage selective prediction, whereby a model can abstain from making a prediction when it is unconfident. Yet, conventional metrics like accuracy,…

Machine Learning · Computer Science 2025-05-27 Kourosh Shahnazari, Seyed Moein Ayyoubzadeh, Mohammadali Keshtparvar, Pegah Ghaffari

Cascading Robustness Verification: Toward Efficient Model-Agnostic Certification

Certifying neural network robustness against adversarial examples is challenging, as formal guarantees often require solving non-convex problems. Hence, incomplete verifiers are widely used because they scale efficiently and substantially…

Machine Learning · Computer Science 2026-02-05 Mohammadreza Maleki, Rushendra Sidibomma, Arman Adibi, Reza Samavi

Ambig-DS: A Benchmark for Task-Framing Ambiguity in Data-Science Agents

As data-science agents shift from co-pilots to auto-pilots, silent misframing becomes a critical failure mode. Agents quietly commit to plausible but unintended task framings, producing clean, executable artifacts that hide their incorrect…

Artificial Intelligence · Computer Science 2026-05-12 Josefa Lia Stoisser, Marc Boubnovski Martell, Sidsel Boldsen, Kaspar Märtens, Robert Kitchen

Uncertainty Aware Training to Improve Deep Learning Model Calibration for Classification of Cardiac MR Images

Quantifying uncertainty of predictions has been identified as one way to develop more trustworthy artificial intelligence (AI) models beyond conventional reporting of performance metrics. When considering their role in a clinical decision…

Image and Video Processing · Electrical Eng. & Systems 2023-08-30 Tareen Dawood, Chen Chen, Baldeep S. Sidhu, Bram Ruijsink, Justin Gould, Bradley Porter, Mark K. Elliott, Vishal Mehta, Christopher A. Rinaldi, Esther Puyol-Anton, Reza Razavi, Andrew P. King

Uncertainty-Aware Transformers: Conformal Prediction for Language Models

Transformers have had a profound impact on the field of artificial intelligence, especially on large language models and their variants. However, as was the case with neural networks, their black-box nature limits trust and deployment in…

Machine Learning · Computer Science 2026-04-13 Abhiram Vellore, Niraj K. Jha

CHASE: Competing Hypotheses for Ambiguity-Aware Selective Prediction

Standard selective prediction methods typically estimate uncertainty from the output of a single predictive branch. While effective for general uncertainty estimation, these approaches often struggle under partial observability, where local…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Kartik Jhawar, Yuhao Geng, Atul N. Parikh, Lipo Wang

Approximate Cross-Validation for Structured Models

Many modern data analyses benefit from explicitly modeling dependence structure in data -- such as measurements across time or space, ordered words in a sentence, or genes in a genome. A gold standard evaluation technique is structured…

Machine Learning · Statistics 2020-12-02 Soumya Ghosh, William T. Stephenson, Tin D. Nguyen, Sameer K. Deshpande, Tamara Broderick

Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval, Saharon Rosset

A feature-stable and explainable machine learning framework for trustworthy decision-making under incomplete clinical data

Machine learning models are increasingly applied to biomedical data, yet their adoption in high stakes domains remains limited by poor robustness, limited interpretability, and instability of learned features under realistic data…

Machine Learning · Computer Science 2026-02-20 Justyna Andrys-Olek, Paulina Tworek, Luca Gherardini, Mark W. Ruddock, Mary Jo Kurt, Peter Fitzgerald, Jose Sousa

Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning

Machine learning (ML) systems are increasingly deployed in high-stakes domains where reliability is paramount. This thesis investigates how uncertainty estimation can enhance the safety and trustworthiness of ML, focusing on selective…

Machine Learning · Computer Science 2025-09-09 Stephan Rabanser

A General Framework for Uncertainty Estimation in Deep Learning

Neural network predictions are unreliable when the input sample is out of the training distribution or corrupted by noise. Being able to detect such failures automatically is fundamental to integrating deep learning algorithms into robotics.…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Antonio Loquercio, Mattia Segù, Davide Scaramuzza

Beyond Uncertainty Quantification: Learning Uncertainty for Trust-Informed Neural Network Decisions - A Case Study in COVID-19 Classification

Reliable uncertainty quantification is critical in high-stakes applications, such as medical diagnosis, where confidently incorrect predictions can erode trust in automated decision-making systems. Traditional uncertainty quantification…

Image and Video Processing · Electrical Eng. & Systems 2025-10-21 Hassan Gharoun, Mohammad Sadegh Khorshidi, Fang Chen, Amir H. Gandomi

Trust, Geometry, and Rules: A Credibility-Aware Reinforcement Learning Framework for Safe USV Navigation under Uncertainty

Autonomous navigation of Unmanned Surface Vehicles (USVs) that is safe and compliant with the International Regulations for Preventing Collisions at Sea (COLREGs) remains a formidable challenge in dynamic maritime environments, particularly…

Robotics · Computer Science 2026-05-29 Yuhang Zhang, Shuqi Chai, Yukang Zhang, Liusha Yang, Mingchuan Zhang, Wei Wang, Qingjiang Shi, Quanbo Ge

Silent Commitment Failure in Instruction-Tuned Language Models: Evidence of Governability Divergence Across Architectures

As large language models are deployed as autonomous agents with tool execution privileges, a critical assumption underpins their security architecture: that model errors are detectable at runtime. We present empirical evidence that this…

Artificial Intelligence · Computer Science 2026-03-24 Gregory M. Ruddell

Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection

Camouflaged Object Detection (COD), the task of identifying objects concealed within their environments, has seen rapid growth due to its wide range of practical applications. A key step toward developing trustworthy COD systems is the…

Computer Vision and Pattern Recognition · Computer Science 2025-02-17 Ziyue Yang, Kehan Wang, Yuhang Ming, Yong Peng, Han Yang, Qiong Chen, Wanzeng Kong

Confident Learning: Estimating Uncertainty in Dataset Labels

Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and…

Machine Learning · Statistics 2022-08-23 Curtis G. Northcutt, Lu Jiang, Isaac L. Chuang

Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning

So-called unsupervised anomaly detection is better described as semi-supervised, as it assumes all training data are nominal. This assumption simplifies training but requires manual data curation, introducing bias and limiting adaptability.…

Computer Vision and Pattern Recognition · Computer Science 2025-10-29 Muhammad Aqeel, Shakiba Sharifi, Marco Cristani, Francesco Setti

An Ambiguity Measure for Recognizing the Unknowns in Deep Learning

We study the understanding of deep neural networks from the scope in which they are trained. While the accuracy of these models is usually impressive on the aggregate level, they still make mistakes, sometimes on cases that appear to be…

Machine Learning · Computer Science 2023-12-12 Roozbeh Yousefzadeh

Calibrated Bayesian Deep Learning for Explainable Decision Support Systems Based on Medical Imaging

In critical decision support systems based on medical imaging, the reliability of AI-assisted decision-making is as relevant as predictive accuracy. Although deep learning models have demonstrated significant accuracy, they frequently…

Computer Vision and Pattern Recognition · Computer Science 2026-02-13 Hua Xu, Julián D. Arias-Londoño, Juan I. Godino-Llorente

Risk-Calibrated Learning: Minimizing Fatal Errors in Medical AI

Deep learning models often achieve expert-level accuracy in medical image classification but suffer from a critical flaw: semantic incoherence. These high-confidence mistakes that are semantically incoherent (e.g., classifying a malignant…

Computer Vision and Pattern Recognition · Computer Science 2026-04-15 Abolfazl Mohammadi-Seif, Ricardo Baeza-Yates