Related papers: Small Sample Inference for Generalization Error in…

Revisiting Confidence Estimation: Towards Reliable Failure Prediction

Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident for their incorrect predictions, i.e., misclassified samples from…

Computer Vision and Pattern Recognition · Computer Science 2024-03-06 Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu

Generalization error for decision problems

In this entry we review the generalization error for classification and single-stage decision problems. We distinguish three alternative definitions of the generalization error which have, at times, been conflated in the statistics…

Methodology · Statistics 2018-12-21 Eric B. Laber, Min Qian

Confidence Sets under Generalized Self-Concordance

This paper revisits a fundamental problem in statistical inference from a non-asymptotic theoretical viewpoint – the construction of confidence sets. We establish a finite-sample bound for the estimator, characterizing its…

Statistics Theory · Mathematics 2023-01-03 Lang Liu, Zaid Harchaoui

Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins

A problem of bounding the generalization error of a classifier f in H, where H is a "base" class of functions (classifiers), is considered. This problem frequently occurs in computer learning, where efficient algorithms of combining simple…

Probability · Mathematics 2007-06-13 Vladimir Koltchinskii, Dmitry Panchenko, Fernando Lozano

Generalized Adversarial Distances to Efficiently Discover Classifier Errors

Given a black-box classification model and an unlabeled evaluation dataset from some application domain, efficient strategies need to be developed to evaluate the model. Random sampling allows a user to estimate metrics like accuracy,…

Machine Learning · Computer Science 2021-02-26 Walter Bennette, Sally Dufek, Karsten Maurer, Sean Sisti, Bunyod Tusmatov

Improving Confidence Estimates for Unfamiliar Examples

Intuitively, unfamiliarity should lead to lack of confidence. In reality, current algorithms often make highly confident yet wrong predictions when faced with relevant but unfamiliar examples. A classifier we trained to recognize gender is…

Computer Vision and Pattern Recognition · Computer Science 2020-09-09 Zhizhong Li, Derek Hoiem

Optimality of Training/Test Size and Resampling Effectiveness of Cross-Validation Estimators of the Generalization Error

An important question in constructing Cross Validation (CV) estimators of the generalization error is whether rules can be established that allow "optimal" selection of the size of the training set, for fixed sample size $n$. We define the…

Statistics Theory · Mathematics 2015-11-11 Georgios Afendras, Marianthi Markatou

Detecting Misclassification Errors in Neural Networks with a Gaussian Process Model

As neural network classifiers are deployed in real-world applications, it is crucial that their failures can be detected reliably. One practical solution is to assign confidence scores to each prediction, then use these scores to filter out…

Machine Learning · Computer Science 2022-01-04 Xin Qiu, Risto Miikkulainen

In Search of Robust Measures of Generalization

One of the principal scientific challenges in deep learning is explaining generalization, i.e., why the particular way the community now trains networks to achieve small training error also leads to small error on held-out data from the…

Machine Learning · Computer Science 2021-01-22 Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy

A Statistical Model for Predicting Generalization in Few-Shot Classification

The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to…

Machine Learning · Computer Science 2023-03-29 Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux, Ghouthi Boukli Hacene, Javier Alonso Garcia

Cheap Subsampling bootstrap confidence intervals for fast and robust inference

Bootstrapping is often applied to get confidence limits for semiparametric inference of a target parameter in the presence of nuisance parameters. Bootstrapping with replacement can be computationally expensive and problematic when…

Methodology · Statistics 2025-03-06 Johan Sebastian Ohlendorff, Anders Munch, Kathrine Kold Sørensen, Thomas Alexander Gerds

Evaluating machine learning models in non-standard settings: An overview and new findings

Estimating the generalization error (GE) of machine learning models is fundamental, with resampling methods being the most common approach. However, in non-standard settings, particularly those where observations are not independently and…

Machine Learning · Statistics 2023-10-24 Roman Hornung, Malte Nalenz, Lennart Schneider, Andreas Bender, Ludwig Bothmann, Bernd Bischl, Thomas Augustin, Anne-Laure Boulesteix

When to Trust Confidence Thresholding: Calibration Diagnostics for Pseudo-Labelled Regression

Calibrated probability outputs of trained classifiers are increasingly used as inputs to downstream regression estimands such as effects, prevalences, or disparities for a latent group observed only on a small labelled subset. A standard…

Methodology · Statistics 2026-05-14 Marcell T. Kurbucz

Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

For many applications, an ensemble of base classifiers is an effective solution. The tuning of its parameters (number of classes, amount of data on which each classifier is to be trained, etc.) requires G, the generalization error of a…

Machine Learning · Computer Science 2017-11-16 Dhruv Mahajan, Vivek Gupta, S Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

Generalizations to Corrections for the Effects of Measurement Error in Approximately Consistent Methodologies

Measurement error is a pervasive issue which renders the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those which aim to produce exactly…

Methodology · Statistics 2021-11-08 Dylan Spicker, Michael P Wallace, Grace Y Yi

Good Classifiers are Abundant in the Interpolating Regime

Within the machine learning community, the widely-used uniform convergence framework has been used to answer the question of how complex, over-parameterized models can generalize well to new data. This approach bounds the test error of the…

Machine Learning · Statistics 2021-03-05 Ryan Theisen, Jason M. Klusowski, Michael W. Mahoney

Generalized Resubstitution for Classification Error Estimation

We propose the family of generalized resubstitution classifier error estimators based on empirical measures. These error estimators are computationally efficient and do not require re-training of classifiers. The plain resubstitution error…

Machine Learning · Statistics 2021-10-26 Parisa Ghane, Ulisses Braga-Neto

Applications of the Fractional-Random-Weight Bootstrap

The bootstrap, based on resampling, has, for several decades, been a widely used method for computing confidence intervals for applications where no exact method is available and when sample sizes are not large enough to be able to rely on…

Applications · Statistics 2018-08-27 Chris Gotwalt, Li Xu, Yili Hong, William Q. Meeker

Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study

When assessing the quality of prediction models in machine learning, confidence intervals (CIs) for the generalization error, which measures predictive performance, are a crucial tool. Luckily, there exist many methods for computing such…

Machine Learning · Statistics 2025-01-16 Hannah Schulz-Kümpel, Sebastian Fischer, Roman Hornung, Anne-Laure Boulesteix, Thomas Nagler, Bernd Bischl

Cheap Bootstrap for Fast Uncertainty Quantification of Stochastic Gradient Descent

Stochastic gradient descent (SGD) or stochastic approximation has been widely used in model training and stochastic optimization. While there is a huge literature on analyzing its convergence, inference on the obtained solutions from SGD…

Machine Learning · Statistics 2026-04-01 Henry Lam, Zitong Wang