Related papers: Fitting Elephants

200 papers

Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for "overfitted" / interpolated…

Machine Learning · Statistics 2018-10-29 Mikhail Belkin, Daniel Hsu, Partha Mitra

A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We…

Machine Learning · Computer Science 2019-09-10 Vidya Muthukumar, Kailas Vodrahalli, Vignesh Subramanian, Anant Sahai

The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common…

Machine Learning · Computer Science 2023-04-11 Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour

A common strategy to train deep neural networks (DNNs) is to use very large architectures and to train them until they (almost) achieve zero training error. Empirically observed good generalization performance on test data, even in the…

Machine Learning · Statistics 2021-07-26 Nicole Mücke, Ingo Steinwart

In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I…

Machine Learning · Statistics 2021-06-01 Mikhail Belkin

Over-parameterized models attract much attention in the era of data science and deep learning. It is empirically observed that although these models, e.g. deep neural networks, over-fit the training data, they can still achieve small…

Machine Learning · Statistics 2019-09-27 Yue Xing, Qifan Song, Guang Cheng

In the era of deep learning, understanding the over-fitting phenomenon becomes increasingly important. It is observed that carefully designed deep neural networks achieve small testing error even when the training error is close to zero. One…

Machine Learning · Statistics 2018-12-04 Yue Xing, Qifan Song, Guang Cheng

In many modern applications of deep learning the neural network has many more parameters than the data points used for its training. Motivated by those practices, a large body of recent theoretical research has been devoted to studying…

Statistics Theory · Mathematics 2022-12-07 A. Tsigler, P. L. Bartlett

Background. A main theoretical puzzle is why over-parameterized Neural Networks (NNs) generalize well when trained to zero loss (i.e., so they interpolate the data). Usually, the NN is trained with Stochastic Gradient Descent (SGD) or one…

Machine Learning · Computer Science 2025-02-18 Gon Buzaglo, Itamar Harel, Mor Shpigel Nacson, Alon Brutzkus, Nathan Srebro, Daniel Soudry

In some studies (e.g., Zhang et al., 2016) of deep learning, it is observed that over-parametrized deep neural networks achieve a small testing error even when the training error is almost zero. Despite numerous works towards…

Machine Learning · Statistics 2022-02-25 Yue Xing, Qifan Song, Guang Cheng

We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of…

Machine Learning · Computer Science 2024-10-28 Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry

The practical success of overparameterized neural networks has motivated the recent scientific study of interpolating methods, which perfectly fit their training data. Certain interpolating methods, including neural networks, can fit noisy…

Machine Learning · Computer Science 2024-07-17 Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran

Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing…

Machine Learning · Computer Science 2024-07-04 Yoav Wald, Gal Yona, Uri Shalit, Yair Carmon

The widespread success of deep neural networks has revealed a surprise in classical machine learning: very complex models often generalize well while simultaneously overfitting training data. This phenomenon of benign overfitting has been…

Quantum Physics · Physics 2023-12-20 Evan Peters, Maria Schuld

The literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification; however, modern machine learning operates in the multiclass setting. Motivated by this discrepancy, we…

Machine Learning · Statistics 2023-07-13 Ke Wang, Vidya Muthukumar, Christos Thrampoulidis

Understanding how overparameterized neural networks generalize despite perfect interpolation of noisy training data is a fundamental question. Mallinar et al. 2022 noted that neural networks seem to often exhibit "tempered overfitting",…

Machine Learning · Computer Science 2024-03-25 Nirmit Joshi, Gal Vardi, Nathan Srebro

The recent success of neural network models has shone light on a rather surprising statistical phenomenon: statistical models that perfectly fit noisy data can generalize well to unseen test data. Understanding this phenomenon of…

Machine Learning · Statistics 2022-09-13 Niladri S. Chatterji, Philip M. Long, Peter L. Bartlett

Imitation learning considerably simplifies policy synthesis compared to alternative approaches by exploiting access to expert demonstrations. For such imitation policies, errors away from the training samples are particularly critical. Even…

Machine Learning · Computer Science 2024-03-19 Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, James Weimer, Insup Lee

In this work we consider a model problem of deep neural learning, namely the learning of a given function when it is assumed that we have access to its point values on a finite set of points. The deep neural network interpolant is the…

Machine Learning · Statistics 2023-06-27 Michail Loulakis, Charalambos G. Makridakis

Modern approaches to supervised learning like deep neural networks (DNNs) typically implicitly assume that observed responses are statistically independent. In contrast, correlated data are prevalent in real-life large-scale applications,…

Machine Learning · Statistics 2023-01-30 Giora Simchoni, Saharon Rosset