Related papers: Generalization error for decision problems
At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…
Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator…
Measurement error is a pervasive issue which renders the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those which aim to produce exactly…
In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the…
Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, generalization performance of such approaches is not well…
This work performs a non-asymptotic analysis of the generalized Lasso under the assumption of sub-exponential data. Our main results continue recent research on the benchmark case of (sub-)Gaussian sample distributions and thereby explore…
This paper studies the generalization error of invariant classifiers. In particular, we consider the common scenario where the classification task is invariant to certain transformations of the input, and that the classifier is constructed…
Transformers serve as the foundational architecture for many successful large-scale models, demonstrating the ability to overfit the training data while maintaining strong generalization on unseen data, a phenomenon known as benign…
We investigate random feature models in which neural networks sampled from a prescribed initialization ensemble are frozen and used as random features, with only the readout weights optimized. Adopting a statistical-physics viewpoint, we…
We propose the family of generalized resubstitution classifier error estimators based on empirical measures. These error estimators are computationally efficient and do not require re-training of classifiers. The plain resubstitution error…
We consider high-dimensional estimation problems where the number of parameters diverges with the sample size. General conditions are established for consistency, uniqueness, and asymptotic normality in both unpenalized and penalized…
Generalised regression estimation allows one to make use of available auxiliary information in survey sampling. We develop three types of generalised regression estimator when the auxiliary data cannot be matched perfectly to the sample…
Generalization is at the core of machine learning models. However, the definition of generalization is not entirely clear. We employ set theory to introduce the concepts of algorithms, hypotheses, and dataset generalization. We analyze the…
In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically…
This paper is motivated by an open problem around deep networks, namely, the apparent absence of over-fitting despite large over-parametrization which allows perfect fitting of the training data. In this paper, we analyze this phenomenon in…
Calibration is a fundamental property of a good predictive model: it requires that the model predicts correctly in proportion to its confidence. Modern neural networks, however, provide no strong guarantees on their calibration -- and can…
The asymptotically precise estimation of the generalization of kernel methods has recently received attention due to the parallels between neural networks and their associated kernels. However, prior works derive such estimates for training…
Generalization error predictors (GEPs) aim to predict model performance on unseen distributions by deriving dataset-level error estimates from sample-level scores. However, GEPs often utilize disparate mechanisms (e.g., regressors,…
We introduce a new notion of generalization -- Distributional Generalization -- which roughly states that outputs of a classifier at train and test time are close *as distributions*, as opposed to close in just their average error. For…
Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still…