Related papers: Calibrating Black Box Classification Models throug…

Optimizing Black-box Metrics with Iterative Example Weighting

We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix. Such black-box learning settings are ubiquitous, for example, when the learner only has query access to the metric of…

Machine Learning · Computer Science 2021-06-25 Gaurush Hiranandani, Jatin Mathur, Harikrishna Narasimhan, Mahdi Milani Fard, Oluwasanmi Koyejo

Dealing with Class Imbalance using Thresholding

We propose thresholding as an approach to deal with class imbalance. We define the concept of thresholding as a process of determining a decision boundary in the presence of a tunable parameter. The threshold is the maximum value of this…

Machine Learning · Computer Science 2016-07-12 Charmgil Hong, Rumi Ghosh, Soundar Srinivasan

From Black-box to White-box: Examining Confidence Calibration under different Conditions

Confidence calibration is a major concern when applying artificial neural networks in safety-critical applications. Since most research in this area has focused on classification in the past, confidence calibration in the scope of object…

Computer Vision and Pattern Recognition · Computer Science 2021-01-11 Franziska Schwaiger, Maximilian Henne, Fabian Küppers, Felippe Schmoeller Roza, Karsten Roscher, Anselm Haselhoff

Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithms work with any underlying model and (unknown) data-generating…

Machine Learning · Computer Science 2022-10-03 Anastasios N. Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, Lihua Lei

Hidden Heterogeneity: When to Choose Similarity-Based Calibration

Trustworthy classifiers are essential to the adoption of machine learning predictions in many real-world settings. The predicted probability of possible outcomes can inform high-stakes decision making, particularly when assessing the…

Machine Learning · Computer Science 2023-02-22 Kiri L. Wagstaff, Thomas G. Dietterich

A Survey of Calibration Process for Black-Box LLMs

Large Language Models (LLMs) demonstrate remarkable performance in semantic understanding and generation, yet accurately assessing their output reliability remains a significant challenge. While numerous studies have explored calibration…

Artificial Intelligence · Computer Science 2024-12-18 Liangru Xie, Hui Liu, Jingying Zeng, Xianfeng Tang, Yan Han, Chen Luo, Jing Huang, Zhen Li, Suhang Wang, Qi He

Threshold Choice Methods: the Missing Link

Many performance metrics have been introduced for the evaluation of classification performance, with different origins and niches of application: accuracy, macro-accuracy, area under the ROC curve, the ROC convex hull, the absolute error,…

Artificial Intelligence · Computer Science 2012-01-31 José Hernández-Orallo, Peter Flach, Cèsar Ferri

Iterative hard-thresholding applied to optimal control problems with $L^0(\Omega)$ control cost

We investigate the hard-thresholding method applied to optimal control problems with $L^0(\Omega)$ control cost, which penalizes the measure of the support of the control. As the underlying measure space is non-atomic, arguments of…

Optimization and Control · Mathematics 2018-06-18 Daniel Wachsmuth

Two Sides of Miscalibration: Identifying Over and Under-Confidence Prediction for Network Calibration

Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Miscalibration can lead to model over-confidence and/or under-confidence; i.e., the model's confidence in its prediction…

Machine Learning · Computer Science 2023-08-08 Shuang Ao, Stefan Rueger, Advaith Siddharthan

OTLP: Output Thresholding Using Mixed Integer Linear Programming

Output thresholding is the technique to search for the best threshold to be used during inference for any classifiers that can produce probability estimates on train and testing datasets. It is particularly useful in high imbalance…

Machine Learning · Computer Science 2024-05-21 Baran Koseoglu, Luca Traverso, Mohammed Topiwalla, Egor Kraev, Zoltan Szopory

Adaptive Thresholds for Monitoring and Screening in Imbalanced Samples: Optimality and Boosting Sensitivity

Suppose (standardized) measurements or statistics are monitored to raise an alarm when a threshold is exceeded. Often, the underlying population is heterogeneous with respect to important discrete variables and thus samples may consist of…

Statistics Theory · Mathematics 2025-10-10 Ansgar Steland

Learning How to Optimize Black-Box Functions With Extreme Limits on the Number of Function Evaluations

We consider black-box optimization in which only an extremely limited number of function evaluations, on the order of around 100, are affordable and the function evaluations must be performed in even fewer batches of a limited number of…

Machine Learning · Computer Science 2021-03-19 Carlos Ansotegui, Meinolf Sellmann, Tapan Shah, Kevin Tierney

Non-Parametric Calibration for Classification

Many applications of classification methods not only require high accuracy but also reliable estimation of predictive uncertainty. However, while many current classification frameworks, in particular deep neural networks, achieve high…

Machine Learning · Computer Science 2020-02-28 Jonathan Wenger, Hedvig Kjellström, Rudolph Triebel

Adaptive label thresholding methods for online multi-label classification

Existing online multi-label classification works cannot well handle the online label thresholding problem and lack the regret analysis for their online algorithms. This paper proposes a novel framework of adaptive label thresholding…

Machine Learning · Computer Science 2022-11-15 Tingting Zhai, Hongcheng Tang, Hao Wang

Better Uncertainty Calibration via Proper Scores for Classification and Beyond

With model trustworthiness being crucial for sensitive real-world applications, practitioners are putting more and more focus on improving the uncertainty calibration of deep neural networks. Calibration errors are designed to quantify the…

Machine Learning · Computer Science 2024-03-14 Sebastian G. Gruber, Florian Buettner

Graduated Optimization of Black-Box Functions

Motivated by the problem of tuning hyperparameters in machine learning, we present a new approach for gradually and adaptively optimizing an unknown function using estimated gradients. We validate the empirical performance of the proposed…

Machine Learning · Computer Science 2019-06-05 Weijia Shao, Christian Geißler, Fikret Sivrikaya

Improved Trainable Calibration Method for Neural Networks on Medical Imaging Classification

Recent works have shown that deep neural networks can achieve super-human performance in a wide range of image classification tasks in the medical imaging domain. However, these works have primarily focused on classification accuracy,…

Computer Vision and Pattern Recognition · Computer Science 2020-09-10 Gongbo Liang, Yu Zhang, Xiaoqin Wang, Nathan Jacobs

New Hard-thresholding Rules based on Data Splitting in High-dimensional Imbalanced Classification

In binary classification, imbalance refers to situations in which one class is heavily under-represented. This issue is due to either a data collection process or because one class is indeed rare in a population. Imbalanced classification…

Methodology · Statistics 2022-01-07 Arezou Mojiri, Abbas Khalili, Ali Zeinal Hamadani

Instance-Wise Monotonic Calibration by Constrained Transformation

Deep neural networks often produce miscalibrated probability estimates, leading to overconfident predictions. A common approach for calibration is fitting a post-hoc calibration map on unseen validation data that transforms predicted…

Machine Learning · Computer Science 2025-07-10 Yunrui Zhang, Gustavo Batista, Salil S. Kanhere

Constrained Classification and Ranking via Quantiles

In most machine learning applications, classification accuracy is not the primary metric of interest. Binary classifiers which face class imbalance are often evaluated by the $F_\beta$ score, area under the precision-recall curve, Precision…

Machine Learning · Computer Science 2018-03-02 Alan Mackey, Xiyang Luo, Elad Eban