Related papers: Overparameterization and generalization error: wei…

Generalization error of minimum weighted norm and kernel interpolation

We study the generalization error of functions that interpolate prescribed data points and are selected by minimizing a weighted norm. Under natural and general conditions, we prove that both the interpolants and their generalization errors…

Numerical Analysis · Mathematics 2021-02-11 Weilin Li

Harmless interpolation of noisy data in regression

A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We…

Machine Learning · Computer Science 2019-09-10 Vidya Muthukumar, Kailas Vodrahalli, Vignesh Subramanian, Anant Sahai

Deep Learning Generalization, Extrapolation, and Over-parameterization

We study the generalization of over-parameterized deep networks (for image classification) in relation to the convex hull of their training sets. Despite their great success, generalization of deep networks is considered a mystery. These…

Machine Learning · Computer Science 2022-03-22 Roozbeh Yousefzadeh

Deep Double Descent via Smooth Interpolation

The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common…

Machine Learning · Computer Science 2023-04-11 Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour

Robust Implicit Regularization via Weight Normalization

Overparameterized models may have many interpolating solutions; implicit regularization refers to the hidden preference of a particular optimization method towards a certain interpolating solution among the many. A by now established line…

Machine Learning · Computer Science 2024-09-18 Hung-Hsu Chou, Holger Rauhut, Rachel Ward

A Universal Law of Robustness via Isoperimetry

Classically, data interpolation with a parametrized model class is possible as long as the number of parameters is larger than the number of equations to be satisfied. A puzzling phenomenon in deep learning is that models are trained with…

Machine Learning · Computer Science 2022-12-27 Sébastien Bubeck, Mark Sellke

Generalization Error of Generalized Linear Models in High Dimensions

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…

Machine Learning · Computer Science 2020-05-04 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Benefit of Interpolation in Nearest Neighbor Algorithms

In some studies (e.g., Zhang et al., 2016) of deep learning, it is observed that over-parametrized deep neural networks achieve a small testing error even when the training error is almost zero. Despite numerous works towards…

Machine Learning · Statistics 2022-02-25 Yue Xing, Qifan Song, Guang Cheng

Zero Generalization Error Theorem for Random Interpolators via Algebraic Geometry

We theoretically demonstrate that the generalization error of interpolators for machine learning models under teacher-student settings becomes 0 once the number of training samples exceeds a certain threshold. Understanding the high…

Machine Learning · Computer Science 2025-12-09 Naoki Yoshida, Isao Ishikawa, Masaaki Imaizumi

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized…

Machine Learning · Statistics 2021-09-07 Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk

Optimal generalisation and learning transition in extensive-width shallow neural networks near interpolation

We consider a teacher-student model of supervised learning with a fully-trained two-layer neural network whose width $k$ and input dimension $d$ are large and proportional. We provide an effective theory for approximating the Bayes-optimal…

Machine Learning · Statistics 2025-04-02 Jean Barbier, Francesco Camilli, Minh-Toan Nguyen, Mauro Pastore, Rudy Skerk

Double Descent and Other Interpolation Phenomena in GANs

We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the…

Machine Learning · Computer Science 2024-05-02 Lorenzo Luzi, Yehuda Dar, Richard Baraniuk

Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks

Neural networks trained via gradient descent with random initialization and without any regularization enjoy good generalization performance in practice despite being highly overparametrized. A promising direction to explain this phenomenon…

Machine Learning · Computer Science 2022-05-17 Hancheng Min, Salma Tarmoun, Rene Vidal, Enrique Mallada

A new approach to generalisation error of machine learning algorithms: Estimates and convergence

In this work we consider a model problem of deep neural learning, namely the learning of a given function when it is assumed that we have access to its point values on a finite set of points. The deep neural network interpolant is the…

Machine Learning · Statistics 2023-06-27 Michail Loulakis, Charalambos G. Makridakis

Optimal Implicit Bias in Linear Regression

Most modern learning problems are over-parameterized, where the number of learnable parameters is much greater than the number of training data points. In this over-parameterized regime, the training loss typically has infinitely many…

Machine Learning · Computer Science 2025-06-23 Kanumuri Nithin Varma, Babak Hassibi

How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers

Background. A main theoretical puzzle is why over-parameterized Neural Networks (NNs) generalize well when trained to zero loss (i.e., so they interpolate the data). Usually, the NN is trained with Stochastic Gradient Descent (SGD) or one…

Machine Learning · Computer Science 2025-02-18 Gon Buzaglo, Itamar Harel, Mor Shpigel Nacson, Alon Brutzkus, Nathan Srebro, Daniel Soudry

Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate

Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for "overfitted" / interpolated…

Machine Learning · Statistics 2018-10-29 Mikhail Belkin, Daniel Hsu, Partha Mitra

Memorize to Generalize: on the Necessity of Interpolation in High Dimensional Linear Regression

We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple…

Machine Learning · Statistics 2022-06-17 Chen Cheng, John Duchi, Rohith Kuditipudi

Double descent for least-squares interpolation on contaminated data: A simulation study

Overparametrized models can exhibit an excellent generalization performance, although they should be prone to overfitting according to classical statistical theory. The discovery of the "double descent", indicating that the generalization…

Machine Learning · Computer Science 2026-05-22 Tino Werner

The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training

Modern neural networks are often operated in a strongly overparametrized regime: they comprise so many parameters that they can interpolate the training set, even if actual labels are replaced by purely random ones. Despite this, they…

Machine Learning · Statistics 2022-06-10 Andrea Montanari, Yiqiao Zhong