Related papers: Can Model Uncertainty Function as a Proxy for Mult…

Are You Doubtful? Oh, It Might Be Difficult Then! Exploring the Use of Model Uncertainty for Question Difficulty Estimation

In an educational setting, an estimate of the difficulty of multiple-choice questions (MCQs), a commonly used strategy to assess learning progress, constitutes very useful information for both teachers and students. Since human assessment…

Computation and Language · Computer Science 2025-04-21 Leonidas Zotos, Hedderik van Rijn, Malvina Nissim

Conformal Prediction with Large Language Models for Multi-Choice Question Answering

As large language models continue to be widely developed, robust uncertainty quantification techniques will become crucial for their safe deployment in high-stakes scenarios. In this work, we explore how conformal prediction can be used to…

Computation and Language · Computer Science 2023-07-11 Bhawesh Kumar, Charlie Lu, Gauri Gupta, Anil Palepu, David Bellamy, Ramesh Raskar, Andrew Beam

A little about models

We discuss several aspects of creation of adequate mathematical models in other sciences. In particular, many difficulties stem from great complexity of the source systems and the presence of a variety of uncertain factors. We illustrate…

Optimization and Control · Mathematics 2021-02-19 I. V. Konnov

Asking the Right Question at the Right Time: Human and Model Uncertainty Guidance to Ask Clarification Questions

Clarification questions are an essential dialogue tool to signal misunderstanding, ambiguities, and under-specification in language use. While humans are able to resolve uncertainty by asking questions since childhood, modern dialogue…

Computation and Language · Computer Science 2024-02-12 Alberto Testoni, Raquel Fernández

NLP Methods May Actually Be Better Than Professors at Estimating Question Difficulty

Estimating the difficulty of exam questions is essential for developing good exams, but professors are not always good at this task. We compare various Large Language Model-based methods with three professors in their ability to estimate…

Computation and Language · Computer Science 2025-11-18 Leonidas Zotos, Ivo Pascal de Jong, Matias Valdenegro-Toro, Andreea Ioana Sburlea, Malvina Nissim, Hedderik van Rijn

Difficulty as a Proxy for Measuring Intrinsic Cognitive Load Item

Cognitive load is key to ensuring an optimal learning experience. However, measuring the cognitive load of educational tasks typically relies on self-report measures, which have been criticized by researchers for being subjective. In this…

Human-Computer Interaction · Computer Science 2025-07-18 Minghao Cai, Guher Gorgun, Carrie Demmans Epp

Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?

Large language models (LLMs) have delivered significant breakthroughs across diverse domains but can still produce unreliable or misleading outputs, posing critical challenges for real-world applications. While many recent studies focus on…

Computation and Language · Computer Science 2025-09-08 Yang Nan, Pengfei He, Ravi Tandon, Han Xu

Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making…

Computation and Language · Computer Science 2025-02-03 JoonHo Lee, Jae Oh Woo, Juree Seok, Parisa Hassanzadeh, Wooseok Jang, JuYoun Son, Sima Didari, Baruch Gutow, Heng Hao, Hankyu Moon, Wenjun Hu, Yeong-Dae Kwon, Taehee Lee, Seungjai Min

Take Out Your Calculators: Estimating the Real Difficulty of Question Items with LLM Student Simulations

Standardized math assessments require expensive human pilot studies to establish the difficulty of test items. We investigate the predictive value of open-source large language models (LLMs) for evaluating the difficulty of multiple-choice…

Computation and Language · Computer Science 2026-04-22 Christabel Acquaye, Yi Ting Huang, Marine Carpuat, Rachel Rudinger

Model Analysis & Evaluation for Ambiguous Question Answering

Ambiguous questions are a challenge for Question Answering models, as they require answers that cover multiple interpretations of the original query. To this end, these models are required to generate long-form answers that often combine…

Computation and Language · Computer Science 2023-05-23 Konstantinos Papakostas, Irene Papadopoulou

A Review and Classification of Model Uncertainty

Model uncertainty is a crucial issue in statistics, econometrics and machine learning, yet its definition remains ambiguous and is subject to various interpretations in the literature. So far, there has not been a universally accepted…

Methodology · Statistics 2025-08-12 Guangyuan Cui, Yuting Wei, Xinyu Zhang

Binary classification models with "Uncertain" predictions

Binary classification models which can assign probabilities to categories such as "the tissue is 75% likely to be tumorous" or "the chemical is 25% likely to be toxic" are well understood statistically, but their utility as an input to…

Applications · Statistics 2017-12-05 Damjan Krstajic, Ljubomir Buturovic, Simon Thomas, David E Leahy

Characterizing Sources of Uncertainty to Proxy Calibration and Disambiguate Annotator and Data Bias

Supporting model interpretability for complex phenomena where annotators can legitimately disagree, such as emotion recognition, is a challenging machine learning task. In this work, we show that explicitly quantifying the uncertainty in…

Machine Learning · Computer Science 2019-10-08 Asma Ghandeharioun, Brian Eoff, Brendan Jou, Rosalind W. Picard

Unified Uncertainties: Combining Input, Data and Model Uncertainty into a Single Formulation

Modelling uncertainty in Machine Learning models is essential for achieving safe and reliable predictions. Most research on uncertainty focuses on output uncertainty (predictions), but minimal attention is paid to uncertainty at inputs. We…

Machine Learning · Computer Science 2024-06-28 Matias Valdenegro-Toro, Ivo Pascal de Jong, Marco Zullich

Explaining Predictive Uncertainty by Looking Back at Model Explanations

Predictive uncertainty estimation of pre-trained language models is an important measure of how likely people can trust their predictions. However, little is known about what makes a model prediction uncertain. Explaining predictive…

Computation and Language · Computer Science 2022-10-11 Hanjie Chen, Wanyu Du, Yangfeng Ji

Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence

Question answering models can use rich knowledge sources -- up to one hundred retrieved passages and parametric knowledge in the large-scale language model (LM). Prior work assumes information in such knowledge sources is consistent with…

Computation and Language · Computer Science 2022-10-26 Hung-Ting Chen, Michael J. Q. Zhang, Eunsol Choi

Uncertainty estimation for classification and risk prediction on medical tabular data

In a data-scarce field such as healthcare, where models often deliver predictions on patients with rare conditions, the ability to measure the uncertainty of a model's prediction could potentially lead to improved effectiveness of decision…

Machine Learning · Statistics 2020-05-26 Lotta Meijerink, Giovanni Cinà, Michele Tonutti

System-Level Uncertainty Quantification with Multiple Machine Learning Models: A Theoretical Framework

ML models have errors when used for predictions. The errors are unknown but can be quantified by model uncertainty. When multiple ML models are trained using the same training points, their model uncertainties may be statistically…

Machine Learning · Statistics 2025-09-23 Xiaoping Du

Positional Bias in Binary Question Answering: How Uncertainty Shapes Model Preferences

Positional bias in binary question answering occurs when a model systematically favors one choice over another based solely on the ordering of presented options. In this study, we quantify and analyze positional bias across five large…

Computation and Language · Computer Science 2025-07-02 Tiziano Labruna, Simone Gallo, Giovanni Da San Martino

Manipulating and Measuring Model Interpretability

With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed,…

Artificial Intelligence · Computer Science 2021-08-17 Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna Wallach