Related papers: Generalization error for decision problems

Generalization Error of Generalized Linear Models in High Dimensions

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…

Machine Learning · Computer Science 2020-05-04 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Small Sample Inference for Generalization Error in Classification Using the CUD Bound

Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator…

Machine Learning · Computer Science 2012-06-18 Eric B. Laber, Susan A. Murphy

Generalizations to Corrections for the Effects of Measurement Error in Approximately Consistent Methodologies

Measurement error is a pervasive issue which renders the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those which aim to produce exactly…

Methodology · Statistics 2021-11-08 Dylan Spicker, Michael P Wallace, Grace Y Yi

Generalization Error Bounds for Noisy, Iterative Algorithms

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the…

Machine Learning · Computer Science 2018-01-16 Ankit Pensia, Varun Jog, Po-Ling Loh

Generalization Error for Linear Regression under Distributed Learning

Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, generalization performance of such approaches is not well…

Machine Learning · Statistics 2020-05-05 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Generic Error Bounds for the Generalized Lasso with Sub-Exponential Data

This work performs a non-asymptotic analysis of the generalized Lasso under the assumption of sub-exponential data. Our main results continue recent research on the benchmark case of (sub-)Gaussian sample distributions and thereby explore…

Statistics Theory · Mathematics 2023-01-18 Martin Genzel, Christian Kipp

Generalization Error of Invariant Classifiers

This paper studies the generalization error of invariant classifiers. In particular, we consider the common scenario where the classification task is invariant to certain transformations of the input, and that the classifier is constructed…

Machine Learning · Statistics 2017-07-04 Jure Sokolic, Raja Giryes, Guillermo Sapiro, Miguel R. D. Rodrigues

Understanding Generalization in Transformers: Error Bounds and Training Dynamics Under Benign and Harmful Overfitting

Transformers serve as the foundational architecture for many successful large-scale models, demonstrating the ability to overfit the training data while maintaining strong generalization on unseen data, a phenomenon known as benign…

Machine Learning · Computer Science 2025-02-19 Yingying Zhang, Zhenyu Wu, Jian Li, Yong Liu

Loop Corrections to the Training Error and Generalization Gap of Random Feature Models

We investigate random feature models in which neural networks sampled from a prescribed initialization ensemble are frozen and used as random features, with only the readout weights optimized. Adopting a statistical-physics viewpoint, we…

Machine Learning · Computer Science 2026-04-29 Taeyoung Kim

Generalized Resubstitution for Classification Error Estimation

We propose the family of generalized resubstitution classifier error estimators based on empirical measures. These error estimators are computationally efficient and do not require re-training of classifiers. The plain resubstitution error…

Machine Learning · Statistics 2021-10-26 Parisa Ghane, Ulisses Braga-Neto

Asymptotics for estimating a diverging number of parameters -- with and without sparsity

We consider high-dimensional estimation problems where the number of parameters diverges with the sample size. General conditions are established for consistency, uniqueness, and asymptotic normality in both unpenalized and penalized…

Statistics Theory · Mathematics 2025-04-08 Jana Gauss, Thomas Nagler

Generalised regression estimation given imperfectly matched auxiliary data

Generalised regression estimation allows one to make use of available auxiliary information in survey sampling. We develop three types of generalised regression estimator when the auxiliary data cannot be matched perfectly to the sample…

Methodology · Statistics 2020-05-20 Li-Chun Zhang

Understanding Generalization via Set Theory

Generalization is at the core of machine learning models. However, the definition of generalization is not entirely clear. We employ set theory to introduce the concepts of algorithms, hypotheses, and dataset generalization. We analyze the…

Machine Learning · Computer Science 2023-11-14 Shiqi Liu

When Should You Adjust Standard Errors for Clustering?

In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically…

Statistics Theory · Mathematics 2022-09-21 Alberto Abadie, Susan Athey, Guido Imbens, Jeffrey Wooldridge

An analysis of training and generalization errors in shallow and deep networks

This paper is motivated by an open problem around deep networks, namely, the apparent absence of over-fitting despite large over-parametrization which allows perfect fitting of the training data. In this paper, we analyze this phenomenon in…

Machine Learning · Computer Science 2019-08-28 Hrushikesh Mhaskar, Tomaso Poggio

The Calibration Generalization Gap

Calibration is a fundamental property of a good predictive model: it requires that the model predicts correctly in proportion to its confidence. Modern neural networks, however, provide no strong guarantees on their calibration -- and can…

Machine Learning · Computer Science 2022-10-07 A. Michael Carrell, Neil Mallinar, James Lucas, Preetum Nakkiran

Generalization error of spectral algorithms

The asymptotically precise estimation of the generalization of kernel methods has recently received attention due to the parallels between neural networks and their associated kernels. However, prior works derive such estimates for training…

Machine Learning · Computer Science 2024-03-19 Maksim Velikanov, Maxim Panov, Dmitry Yarotsky

On the Efficacy of Generalization Error Prediction Scoring Functions

Generalization error predictors (GEPs) aim to predict model performance on unseen distributions by deriving dataset-level error estimates from sample-level scores. However, GEPs often utilize disparate mechanisms (e.g., regressors,…

Machine Learning · Computer Science 2023-05-30 Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

Distributional Generalization: A New Kind of Generalization

We introduce a new notion of generalization -- Distributional Generalization -- which roughly states that outputs of a classifier at train and test time are close *as distributions*, as opposed to close in just their average error. For…

Machine Learning · Computer Science 2020-10-16 Preetum Nakkiran, Yamini Bansal

Generalization Error in Deep Learning

Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still…

Machine Learning · Computer Science 2019-04-09 Daniel Jakubovitz, Raja Giryes, Miguel R. D. Rodrigues