Related papers: Deep learning generalizes because the parameter-fu…

200 papers

Modern deep neural networks are highly over-parameterized compared to the data on which they are trained, yet they often generalize remarkably well. A flurry of recent work has asked: why do deep networks not overfit to their training data?…

Machine Learning · Computer Science 2023-03-24 Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

Background: It is still an open research area to theoretically understand why Deep Neural Networks (DNNs), equipped with many more parameters than training data and trained by (stochastic) gradient-based methods, often achieve remarkably…

Machine Learning · Computer Science 2018-11-30 Zhiqin John Xu

Overparameterised deep neural networks (DNNs) are highly expressive and so can, in principle, generate almost any function that fits a training dataset with zero error. The vast majority of these functions will perform poorly on unseen…

Machine Learning · Computer Science 2021-04-13 Chris Mingard, Guillermo Valle-Pérez, Joar Skalse, Ard A. Louis

Deep neural networks are renowned for their ability to generalise well across diverse tasks, even when heavily overparameterized. Existing works offer only partial explanations (for example, the NTK-based task-model alignment explanation…

Machine Learning · Computer Science 2025-06-02 Chris Mingard, Lukas Seier, Niclas Göring, Andrei-Vlad Badelita, Charles London, Ard Louis

Deep neural networks generalize well despite being heavily overparameterized, in apparent contradiction with classical learning theory based on uniform convergence over fixed hypothesis spaces. Uniform bounds over the entire parameter space…

Machine Learning · Statistics 2026-05-15 Hubert Leroux, Jean Marcus, Julien Roger

This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification while obtaining…

Machine Learning · Computer Science 2023-06-01 Zhenyu Zhu, Fanghui Liu, Grigorios G Chrysos, Francesco Locatello, Volkan Cevher

Deep neural networks (DNNs) are typically optimized using various forms of mini-batch gradient descent algorithm. A major motivation for mini-batch gradient descent is that with a suitably chosen batch size, available computing resources…

Machine Learning · Computer Science 2022-10-25 Oyebade K. Oyedotun, Konstantinos Papadopoulos, Djamila Aouada

In this work, we describe a new approach that uses deep neural networks (DNN) to obtain regularization parameters for solving inverse problems. We consider a supervised learning approach, where a network is trained to approximate the…

Numerical Analysis · Mathematics 2021-04-15 Babak Maboudi Afkham, Julianne Chung, Matthias Chung

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan

Neural networks (NNs) are known to exhibit simplicity bias where they tend to prefer learning 'simple' features over more 'complex' ones, even when the latter may be more informative. Simplicity bias can lead to the model making biased…

Machine Learning · Computer Science 2023-10-11 Bhavya Vasudeva, Kameron Shahabi, Vatsal Sharan

In the past decade, deep neural networks (DNNs) came to the fore as the leading machine learning algorithms for a variety of tasks. Their rise was founded on market needs and engineering craftsmanship, the latter based more on trial and…

Machine Learning · Computer Science 2021-04-14 Omry Cohen, Or Malka, Zohar Ringel

In this paper we consider Deep Neural Networks (DNNs) with a smooth activation function as surrogates for high-dimensional functions that are somewhat smooth but costly to evaluate. We consider the standard (non-periodic) DNNs as well as…

Numerical Analysis · Mathematics 2026-03-04 Alexander Keller, Frances Y. Kuo, Dirk Nuyens, Ian H. Sloan

Deep neural networks often contain far more parameters than training examples, yet they still manage to generalize well in practice. Classical complexity measures such as VC-dimension or PAC-Bayes bounds usually become vacuous in this…

Machine Learning · Computer Science 2025-08-26 Aviral Dhingra

Background. A main theoretical puzzle is why over-parameterized Neural Networks (NNs) generalize well when trained to zero loss (i.e., so they interpolate the data). Usually, the NN is trained with Stochastic Gradient Descent (SGD) or one…

Machine Learning · Computer Science 2025-02-18 Gon Buzaglo, Itamar Harel, Mor Shpigel Nacson, Alon Brutzkus, Nathan Srebro, Daniel Soudry

Generalization is essential for deep learning. In contrast to previous works claiming that Deep Neural Networks (DNNs) have an implicit regularization implemented by the stochastic gradient descent, we demonstrate explicitly Bayesian…

Machine Learning · Computer Science 2019-10-23 Xinjie Lan, Kenneth E. Barner

We study the generalization of over-parameterized deep networks (for image classification) in relation to the convex hull of their training sets. Despite their great success, generalization of deep networks is considered a mystery. These…

Machine Learning · Computer Science 2022-03-22 Roozbeh Yousefzadeh

Deep learning methods are known to generalize well from training to future data, even in an overparametrized regime, where they could easily overfit. One explanation for this phenomenon is that even when their *ambient dimensionality*,…

Machine Learning · Computer Science 2025-05-22 Hossein Zakerinia, Dorsa Ghobadi, Christoph H. Lampert

Promising resolutions of the generalization puzzle observe that the actual number of parameters in a deep network is much smaller than naive estimates suggest. The renormalization group is a compelling example of a problem which has very…

Machine Learning · Computer Science 2020-12-08 Anita de Mello Koch, Ellen de Mello Koch, Robert de Mello Koch

We address the fundamental question of why deep neural networks generalize by establishing a pointwise generalization theory for fully connected networks. This framework resolves long-standing barriers to characterizing the rich nonlinear…

Machine Learning · Computer Science 2026-05-19 Shaojie Li, Yunbei Xu

Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of…

Machine Learning · Computer Science 2018-07-02 Amal Rannen Triki, Maxim Berman, Matthew B. Blaschko