English
Related papers

Related papers: Implicit Regularization in Deep Learning

200 papers

We argue that the optimization plays a crucial role in generalization of deep learning models through implicit regularization. We do this by demonstrating that generalization ability is not controlled by network size but rather by some…

Machine Learning · Computer Science 2017-05-10 Behnam Neyshabur , Ryota Tomioka , Ruslan Salakhutdinov , Nathan Srebro

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may…

Machine Learning · Computer Science 2020-10-20 Noam Razin , Nadav Cohen

Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora , Nadav Cohen , Wei Hu , Yuping Luo

Several works have shown that the regularization mechanisms underlying deep neural networks' generalization performances are still poorly understood. In this paper, we hypothesize that deep neural networks are regularized through their…

Machine Learning · Computer Science 2021-03-12 Carbonnelle Simon , Christophe De Vleeschouwer

From the statistical learning perspective, complexity control via explicit regularization is a necessity for improving the generalization of over-parameterized models. However, the impressive generalization performance of neural networks…

Machine Learning · Computer Science 2021-02-09 Taejong Joo , Uijung Chung

Works on implicit regularization have studied gradient trajectories during the optimization process to explain why deep networks favor certain kinds of solutions over others. In deep linear networks, it has been shown that gradient descent…

Machine Learning · Computer Science 2023-06-02 Dan Zhao

Deep learning systems are known to exhibit implicit regularization (alt. implicit bias), favoring simple solutions instead of merely minimizing the loss function. In some cases, we can analytically derive the implicit regularization --…

Machine Learning · Statistics 2026-05-08 Joseph H. Rudoler , Kevin Tan , Giles Hooker , Konrad P. Kording

Gradient-based deep-learning algorithms exhibit remarkable performance in practice, but it is not well-understood why they are able to generalize despite having more parameters than training examples. It is believed that implicit bias is a…

Machine Learning · Computer Science 2022-11-08 Gal Vardi

Modern deep learning models generalize remarkably well in-distribution, despite being overparametrized and trained with little to no explicit regularization. Instead, current theory credits implicit regularization imposed by the choice of…

Machine Learning · Computer Science 2026-03-17 Jonathan Wenger , Beau Coker , Juraj Marusic , John P. Cunningham

We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al, along a small…

Over-parameterized neural networks generalize well in practice without any explicit regularization. Although it has not been proven yet, empirical evidence suggests that implicit regularization plays a crucial role in deep learning and…

Machine Learning · Computer Science 2019-03-07 Masayoshi Kubo , Ryotaro Banno , Hidetaka Manabe , Masataka Minoji

Recent efforts to unravel the mystery of implicit regularization in deep learning have led to a theoretical focus on matrix factorization -- matrix completion via linear neural network. As a step further towards practical deep learning, we…

Machine Learning · Computer Science 2021-06-10 Noam Razin , Asaf Maman , Nadav Cohen

Data augmentation is used in machine learning to make the classifier invariant to label-preserving transformations. Usually this invariance is only encouraged implicitly by including a single augmented input during training. However,…

Machine Learning · Computer Science 2022-03-08 Aleksander Botev , Matthias Bauer , Soham De

In many contexts, simpler models are preferable to more complex models and the control of this model complexity is the goal for many methods in machine learning such as regularization, hyperparameter tuning and architecture design. In deep…

Machine Learning · Computer Science 2022-12-27 Benoit Dherin , Michael Munn , Mihaela Rosca , David G. T. Barrett

We study the implicit regularization effects of deep learning in tensor factorization. While implicit regularization in deep matrix and 'shallow' tensor factorization via linear and certain type of non-linear neural networks promotes…

Machine Learning · Computer Science 2022-07-27 Kais Hariz , Hachem Kadri , Stéphane Ayache , Maher Moakher , Thierry Artières

A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this…

Machine Learning · Computer Science 2025-12-19 Maria Matveev , Vit Fojtik , Hung-Hsu Chou , Gitta Kutyniok , Johannes Maly

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations…

Machine Learning · Statistics 2026-03-19 Max Schölpple , Liu Fanghui , Ingo Steinwart

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization,…

Machine Learning · Computer Science 2017-07-07 Behnam Neyshabur , Srinadh Bhojanapalli , David McAllester , Nathan Srebro

The notion of implicit bias, or implicit regularization, has been suggested as a means to explain the surprising generalization ability of modern-days overparameterized learning algorithms. This notion refers to the tendency of the…

Machine Learning · Computer Science 2020-12-23 Assaf Dauber , Meir Feder , Tomer Koren , Roi Livni

In the pursuit of explaining implicit regularization in deep learning, prominent focus was given to matrix and tensor factorizations, which correspond to simplified neural networks. It was shown that these models exhibit an implicit…

Machine Learning · Computer Science 2022-09-20 Noam Razin , Asaf Maman , Nadav Cohen
‹ Prev 1 2 3 10 Next ›