Related papers: Implicit Regularization Leads to Benign Overfittin…

200 papers

The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider…

Machine Learning · Statistics 2022-06-08 Peter L. Bartlett, Philip M. Long, Gábor Lugosi, Alexander Tsigler

Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent. To better understand this empirical observation, we…

Machine Learning · Computer Science 2025-07-04 Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss; yet surprisingly, they possess near-optimal prediction performance, contradicting classical learning theory. We…

Machine Learning · Statistics 2021-06-08 Zhu Li, Zhi-Hua Zhou, Arthur Gretton

The recent success of neural network models has shone light on a rather surprising statistical phenomenon: statistical models that perfectly fit noisy data can generalize well to unseen test data. Understanding this phenomenon of…

Machine Learning · Statistics 2022-09-13 Niladri S. Chatterji, Philip M. Long, Peter L. Bartlett

The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data while attaining near-optimal expected loss, has received much attention in recent years, but still remains not fully understood beyond well-specified…

Machine Learning · Computer Science 2023-04-18 Ohad Shamir

Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing…

Machine Learning · Computer Science 2024-07-04 Yoav Wald, Gal Yona, Uri Shalit, Yair Carmon

Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for "overfitted" / interpolated…

Machine Learning · Statistics 2018-10-29 Mikhail Belkin, Daniel Hsu, Partha Mitra

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite…

Statistics Theory · Mathematics 2021-03-17 Peter L. Bartlett, Andrea Montanari, Alexander Rakhlin

The practical success of overparameterized neural networks has motivated the recent scientific study of interpolating methods, which perfectly fit their training data. Certain interpolating methods, including neural networks, can fit noisy…

Machine Learning · Computer Science 2024-07-17 Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran

The literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification; however, modern machine learning operates in the multiclass setting. Motivated by this discrepancy, we…

Machine Learning · Statistics 2023-07-13 Ke Wang, Vidya Muthukumar, Christos Thrampoulidis

Most modern learning problems are over-parameterized, where the number of learnable parameters is much greater than the number of training data points. In this over-parameterized regime, the training loss typically has infinitely many…

Machine Learning · Computer Science 2025-06-23 Kanumuri Nithin Varma, Babak Hassibi

Modern machine learning models are often trained in a setting where the number of parameters exceeds the number of training samples. To understand the implicit bias of gradient descent in such overparameterized models, prior work has…

Machine Learning · Statistics 2025-10-29 Hannes Matt, Dominik Stöger

In the absence of explicit regularization, Kernel "Ridgeless" Regression with nonlinear kernels has the potential to fit the training data perfectly. It has been observed empirically, however, that such interpolated solutions can still…

Statistics Theory · Mathematics 2020-07-27 Tengyuan Liang, Alexander Rakhlin

Gradient descent can be surprisingly good at optimizing deep neural networks without overfitting and without explicit regularization. We find that the discrete steps of gradient descent implicitly regularize models by penalizing gradient…

Machine Learning · Computer Science 2022-07-20 David G. T. Barrett, Benoit Dherin

Modern deep learning has revealed a surprising statistical phenomenon known as benign overfitting, with high-dimensional linear regression being a prominent example. This paper contributes to ongoing research on the ordinary least squares…

Statistics Theory · Mathematics 2024-11-12 Letian Yang, Dennis Shen

Benign overfitting is a phenomenon in machine learning where a model perfectly fits (interpolates) the training data, including noisy examples, yet still generalizes well to unseen data. Understanding this phenomenon has attracted…

Machine Learning · Computer Science 2025-05-20 Junhyung Park, Patrick Bloebaum, Shiva Prasad Kasiviswanathan

Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined…

Statistics Theory · Mathematics 2022-02-15 Peng Zhao, Yun Yang, Qiao-Chu He

A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this…

Machine Learning · Computer Science 2025-12-19 Maria Matveev, Vit Fojtik, Hung-Hsu Chou, Gitta Kutyniok, Johannes Maly

An evolving line of machine learning works observe empirical evidence that suggests interpolating estimators -- the ones that achieve zero training error -- may not necessarily be harmful. This paper pursues theoretical understanding for an…

Statistics Theory · Mathematics 2021-10-19 Yue Li, Yuting Wei

The Ridgeless minimum $\ell_2$-norm interpolator in overparametrized linear regression has attracted considerable attention in recent years in both machine learning and statistics communities. While it seems to defy conventional wisdom that…

Statistics Theory · Mathematics 2026-01-21 Qiyang Han, Xiaocong Xu