Related papers: Calibrating Structured Output Predictors for Natur…

Towards Improving Selective Prediction Ability of NLP Systems

It's better to say "I can't answer" than to answer incorrectly. This selective prediction ability is crucial for NLP systems to be reliably deployed in real-world applications. Prior work has shown that existing selective prediction…

Computation and Language · Computer Science 2022-04-08 Neeraj Varshney, Swaroop Mishra, Chitta Baral

Calibration of Neural Networks

Neural networks solving real-world problems are often required not only to make accurate predictions but also to provide a confidence level in the forecast. The calibration of a model indicates how close the estimated confidence is to the…

Neural and Evolutionary Computing · Computer Science 2023-03-21 Ruslan Vasilev, Alexander D'yakonov

Confidence Calibration for Convolutional Neural Networks Using Structured Dropout

In classification applications, we often want probabilistic predictions to reflect confidence or uncertainty. Dropout, a commonly used training technique, has recently been linked to Bayesian inference, yielding an efficient way to quantify…

Machine Learning · Computer Science 2019-06-25 Zhilu Zhang, Adrian V. Dalca, Mert R. Sabuncu

Calibrated Interpretation: Confidence Estimation in Semantic Parsing

Sequence generation models are increasingly being used to translate natural language into programs, i.e. to perform executable semantic parsing. The fact that semantic parsing aims to predict programs that can lead to executed actions in…

Computation and Language · Computer Science 2023-07-10 Elias Stengel-Eskin, Benjamin Van Durme

Multicalibration for Confidence Scoring in LLMs

This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously…

Machine Learning · Statistics 2024-04-09 Gianluca Detommaso, Martin Bertran, Riccardo Fogliato, Aaron Roth

Conformal Prediction for Natural Language Processing: A Survey

The rapid proliferation of large language models and natural language processing (NLP) applications creates a crucial need for uncertainty quantification to mitigate risks such as hallucinations and to enhance decision-making reliability in…

Computation and Language · Computer Science 2024-05-06 Margarida M. Campos, António Farinhas, Chrysoula Zerva, Mário A. T. Figueiredo, André F. T. Martins

Calibrate your listeners! Robust communication-based training for pragmatic speakers

To be good conversational partners, natural language processing (NLP) systems should be trained to produce contextually useful utterances. Prior work has investigated training NLP systems with communication-based objectives, where a neural…

Computation and Language · Computer Science 2021-10-12 Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman

Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis

Pre-trained language models (PLMs) have gained increasing popularity due to their compelling prediction performance in diverse natural language processing (NLP) tasks. When formulating a PLM-based prediction pipeline for NLP tasks, it is…

Computation and Language · Computer Science 2022-10-17 Yuxin Xiao, Paul Pu Liang, Umang Bhatt, Willie Neiswanger, Ruslan Salakhutdinov, Louis-Philippe Morency

Should We Simultaneously Calibrate Multiple Computer Models?

In an increasing number of applications designers have access to multiple computer models which typically have different levels of fidelity and cost. Traditionally, designers calibrate these models one at a time against some high-fidelity…

Machine Learning · Computer Science 2025-06-02 Jonathan Tammer Eweis-Labolle, Tyler Johnson, Xiangyu Sun, Ramin Bostanabad

Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation?

This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER) tasks. For the future advance of NER in safety-critical fields like healthcare and finance, it is…

Computation and Language · Computer Science 2024-10-28 Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe

Uncertainty Quantification for Named Entity Recognition via Full-Sequence and Subsequence Conformal Prediction

Named Entity Recognition (NER) serves as a foundational component in many natural language processing (NLP) pipelines. However, current NER models typically output a single predicted label sequence without any accompanying measure of…

Computation and Language · Computer Science 2026-01-27 Matthew Singer, Srijan Sengupta, Karl Pazdernik

On Calibration of Large Language Models: From Response To Capability

Large language models (LLMs) are widely deployed as general-purpose problem solvers, making accurate confidence estimation critical for reliable use. Prior work on LLM calibration largely focuses on response-level confidence, which…

Computation and Language · Computer Science 2026-02-17 Sin-Han Yang, Cheng-Kuang Wu, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee, Shao-Hua Sun

Bag of Tricks for In-Distribution Calibration of Pretrained Transformers

While pre-trained language models (PLMs) have become a de-facto standard promoting the accuracy of text classification tasks, recent studies find that PLMs often predict over-confidently. Although various calibration methods have been…

Computation and Language · Computer Science 2023-02-15 Jaeyoung Kim, Dongbin Na, Sungchul Choi, Sungbin Lim

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback

A trustworthy real-world prediction system should produce well-calibrated confidence scores; that is, its confidence in an answer should be indicative of the likelihood that the answer is correct, enabling deferral to an expert in cases of…

Computation and Language · Computer Science 2023-10-25 Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning

On the Calibration of Large Language Models and Alignment

As large language models attract increasing attention and find widespread application, concurrent challenges of reliability also arise at the same time. Confidence calibration, an effective analysis method for gauging the reliability of…

Computation and Language · Computer Science 2023-11-23 Chiwei Zhu, Benfeng Xu, Quan Wang, Yongdong Zhang, Zhendong Mao

On Calibration of Modern Neural Networks

Confidence calibration -- the problem of predicting probability estimates representative of the true correctness likelihood -- is important for classification models in many applications. We discover that modern neural networks, unlike…

Machine Learning · Computer Science 2017-08-04 Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger

Posterior calibration and exploratory analysis for natural language processing models

Many models in natural language processing define probabilistic distributions over linguistic structures. We argue that (1) the quality of a model's posterior distribution can and should be directly evaluated, as to whether probabilities…

Computation and Language · Computer Science 2015-09-03 Khanh Nguyen, Brendan O'Connor

Calibrating Long-form Generations from Large Language Models

To enhance Large Language Models' (LLMs) reliability, calibration is essential -- the model's assessed confidence scores should align with the actual likelihood of its responses being correct. However, current confidence elicitation methods…

Computation and Language · Computer Science 2024-10-29 Yukun Huang, Yixin Liu, Raghuveer Thirukovalluru, Arman Cohan, Bhuwan Dhingra

Confidence Calibration of Classifiers with Many Classes

For classification models based on neural networks, the maximum predicted class probability is often used as a confidence score. This score rarely predicts well the probability of making a correct prediction and requires a post-processing…

Machine Learning · Computer Science 2024-11-07 Adrien LeCoz, Stéphane Herbin, Faouzi Adjed

Quantifying Uncertainties in Natural Language Processing Tasks

Reliable uncertainty quantification is a first step towards building explainable, transparent, and accountable artificial intelligent systems. Recent progress in Bayesian deep learning has made such quantification realizable. In this paper,…

Computation and Language · Computer Science 2018-11-20 Yijun Xiao, William Yang Wang