Related papers: Evaluating Probabilistic Classifiers: The Triptych

200 papers

A probability forecast or probabilistic classifier is reliable or calibrated if the predicted probabilities are matched by ex post observed frequencies, as examined visually in reliability diagrams. The classical binning and counting…

Methodology · Statistics 2021-08-26 Timo Dimitriadis, Tilmann Gneiting, Alexander I. Jordan

A user-focused verification approach for evaluating probability forecasts of binary outcomes (also known as probabilistic classifiers) is demonstrated that is (i) based on proper scoring rules, (ii) focuses on user decision thresholds, and…

Applications · Statistics 2024-03-25 Nicholas Loveday, Robert Taggart, Mohammadreza Khanarmuei

Probabilistic classifiers output a probability distribution on target classes rather than just a class prediction. Besides providing a clear separation of prediction and decision making, the main advantage of probabilistic models is their…

Machine Learning · Computer Science 2019-02-20 Juozas Vaicenavicius, David Widmann, Carl Andersson, Fredrik Lindsten, Jacob Roll, Thomas B. Schön

The assessment of binary classifier performance traditionally centers on discriminative ability using metrics, such as accuracy. However, these metrics often disregard the model's inherent uncertainty, especially when dealing with sensitive…

Machine Learning · Computer Science 2024-02-13 Agathe Fernandes Machado, Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic, François Hu

Analyzing classification model performance is a crucial task for machine learning practitioners. While practitioners often use count-based metrics derived from confusion matrices, like accuracy, many applications, such as weather…

Human-Computer Interaction · Computer Science 2022-07-29 Peter Xenopoulos, Joao Rulff, Luis Gustavo Nonato, Brian Barr, Claudio Silva

There are strong incentives to build models that demonstrate outstanding predictive performance on various datasets and benchmarks. We believe these incentives risk a narrow focus on models and on the performance metrics used to evaluate…

Machine Learning · Computer Science 2022-06-07 David Lovell, Dimity Miller, Jaiden Capra, Andrew Bradley

This paper explores the calibration of a classifier output score in binary classification problems. A calibrator is a function that maps the arbitrary classifier score, of a testing observation, onto $[0,1]$ to provide an estimate for the…

Machine Learning · Computer Science 2022-04-29 Waleed A. Yousef, Issa Traore, William Briguglio

In binary classification tasks, accurate representation of probabilistic predictions is essential for various real-world applications such as predicting payment defaults or assessing medical risks. The model must then be well-calibrated to…

Machine Learning · Computer Science 2024-08-08 Agathe Fernandes Machado, Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic, François Hu

When providing probabilistic forecasts for uncertain future events, it is common to strive for calibrated forecasts, that is, the predictive distribution should be compatible with the observed outcomes. Several notions of calibration are…

Methodology · Statistics 2015-05-21 Christof Strähl, Johanna F. Ziegel

Binary classification is highly used in credit scoring in the estimation of probability of default. The validation of such predictive models is based both on rank ability, and also on calibration (i.e. how accurately the probabilities…

Econometrics · Economics 2017-10-25 Pedro G. Fonseca, Hugo D. Lopes

Predictions are often probabilities; e.g., a prediction could be for precipitation tomorrow, but with only a 30% chance. Given such probabilistic predictions together with the actual outcomes, "reliability diagrams" help detect and diagnose…

Statistics Theory · Mathematics 2022-11-15 Imanol Arrieta-Ibarra, Paman Gujral, Jonathan Tannen, Mark Tygert, Cherie Xu

Model diagnostics and forecast evaluation are two sides of the same coin. A common principle is that fitted or predicted distributions ought to be calibrated or reliable, ideally in the sense of auto-calibration, where the outcome is a…

Methodology · Statistics 2024-09-27 Tilmann Gneiting, Johannes Resin

In the face of uncertainty, the need for probabilistic assessments has long been recognized in the literature on forecasting. In classification, however, comparative evaluation of classifiers often focuses on predictions specifying a single…

Methodology · Statistics 2023-05-31 Johannes Resin

Verification bias is a well-known problem that may occur in the evaluation of predictive ability of diagnostic tests. When a binary disease status is considered, various solutions can be found in the literature to correct inference based on…

Methodology · Statistics 2023-04-10 Khanh To Duc, Monica Chiogna, Gianfranco Adimari

The Receiver Operating Characteristic (ROC) curve of a binary classifier has often been utilized to measure the performance of the classifier. The area beneath this curve is used in particular because of its quoted probabilistic…

Machine Learning · Computer Science 2026-05-05 Steven Redolfi

Safety-critical prediction systems, such as autonomous vehicles, weather forecasters, and medical monitors, commonly rely on probabilistic forecasters. These forecasters make predictions about possible future outcomes, and their quality and…

Methodology · Statistics 2026-04-30 Romeo Valentin

A long noted difficulty when assessing the reliability (or calibration) of forecasting systems is that reliability, in general, is a hypothesis not about a finite dimensional parameter but about an entire functional relationship. A…

Data Analysis, Statistics and Probability · Physics 2020-12-09 Jochen Bröcker

Motivated by the Basel 3 regulations, recent studies have considered joint forecasts of Value-at-Risk and Expected Shortfall. A large family of scoring functions can be used to evaluate forecast performance in this context. However, little…

Risk Management · Quantitative Finance 2017-05-15 Johanna F. Ziegel, Fabian Krüger, Alexander Jordan, Fernando Fasciati

The key concepts (calibration, discrimination, and discordance) important in understanding and comparing risk models are best conveyed graphically. To illustrate this, models predicting death and acute kidney injury in a large cohort of PCI…

Quantitative Methods · Quantitative Biology 2015-04-21 Ralph H. Stern, Dean E. Smith, Hitinder S. Gurm

The Brier score conflates two distinct properties of probabilistic predictions: reliability (calibration error) and resolution (discriminatory power). We introduce the Manokhin Probability Matrix, a BCG-style two-dimensional diagnostic…

Machine Learning · Statistics 2026-05-06 Valery Manokhin