Related papers: Spline-Based Probability Calibration
Calibrating neural networks is of utmost importance when employing them in safety-critical applications where the downstream decision making depends on the predicted probabilities. Measuring calibration error amounts to comparing two…
Generating confidence calibrated outputs is of utmost importance for the applications of deep neural networks in safety-critical decision-making systems. The output of a neural network is a probability distribution where the scores are…
The task of calibration is to retrospectively adjust the outputs from a machine learning model to provide better probability estimates on the target variable. While calibration has been investigated thoroughly in classification, it has not…
Many applications of classification methods not only require high accuracy but also reliable estimation of predictive uncertainty. However, while many current classification frameworks, in particular deep neural networks, achieve high…
Class probabilities predicted by most multiclass classifiers are uncalibrated, often tending towards over-confidence. With neural networks, calibration can be improved by temperature scaling, a method to learn a single corrective…
Many classification applications require accurate probability estimates in addition to good class separation but often classifiers are designed focusing only on the latter. Calibration is the process of improving probability estimates by…
Overconfidence and underconfidence in machine learning classifiers is measured by calibration: the degree to which the probabilities predicted for each class match the accuracy of the classifier on that prediction. How one measures…
This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously…
Consider a multi-class labelling problem, where the labels can take values in $[k]$, and a predictor predicts a distribution over the labels. In this work, we study the following foundational question: Are there notions of multi-class…
In many applications, accurate class probability estimates are required, but many types of models produce poor quality probability estimates despite achieving acceptable classification accuracy. Even though probability calibration has been…
For many applications of probabilistic classifiers it is important that the predicted confidence vectors reflect true probabilities (one says that the classifier is calibrated). It has been shown that common models fail to satisfy this…
Multi-class classification methods that produce sets of probabilistic classifiers, such as ensemble learning methods, are able to model aleatoric and epistemic uncertainty. Aleatoric uncertainty is then typically quantified via the Bayes…
Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered for training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty…
When facing uncertainty, decision-makers want predictions they can trust. A machine learning provider can convey confidence to decision-makers by guaranteeing their predictions are distribution calibrated -- amongst the inputs that receive…
Probabilistic classifiers output a probability distribution on target classes rather than just a class prediction. Besides providing a clear separation of prediction and decision making, the main advantage of probabilistic models is their…
Most supervised machine learning tasks are subject to irreducible prediction errors. Probabilistic predictive models address this limitation by providing probability distributions that represent a belief over plausible targets, rather than…
Calibration can reduce overconfident predictions of deep neural networks, but can calibration also accelerate training? In this paper, we show that it can when used to prioritize some examples for performing subset selection. We study the…
A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. This property has become increasingly important as the impact of machine…
Machine-generated probability predictions are essential in modern classification tasks such as image classification. A model is well calibrated when its predicted probabilities correspond to observed event frequencies. Despite the need for…
Deep neural network (DNN) classifiers are often overconfident, producing miscalibrated class probabilities. In high-risk applications like healthcare, practitioners require $\textit{fully calibrated}$ probability predictions for…