Related papers: Why Unsupervised Deep Networks Generalize

Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Ability

The primary objective of learning methods is generalization. Classic uniform generalization bounds, which rely on VC-dimension or Rademacher complexity, fail to explain the significant attribute that over-parameterized models in deep…

Machine Learning · Computer Science 2025-03-07 Lijia Yu, Yibo Miao, Yifan Zhu, Xiao-Shan Gao, Lijun Zhang

Deep Learning Generalization, Extrapolation, and Over-parameterization

We study the generalization of over-parameterized deep networks (for image classification) in relation to the convex hull of their training sets. Despite their great success, generalization of deep networks is considered a mystery. These…

Machine Learning · Computer Science 2022-03-22 Roozbeh Yousefzadeh

The Low-Rank Simplicity Bias in Deep Networks

Modern deep neural networks are highly over-parameterized compared to the data on which they are trained, yet they often generalize remarkably well. A flurry of recent work has asked: why do deep networks not overfit to their training data?…

Machine Learning · Computer Science 2023-03-24 Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

Deep learning generalizes because the parameter-function map is biased towards simple functions

Deep neural networks (DNNs) generalize remarkably well without explicit regularization even in the strongly over-parametrized regime where classical learning theory would instead predict that they would severely overfit. While many…

Machine Learning · Statistics 2019-04-23 Guillermo Valle-Pérez, Chico Q. Camargo, Ard A. Louis

Do highly over-parameterized neural networks generalize since bad solutions are rare?

We study over-parameterized classifiers where Empirical Risk Minimization (ERM) for learning leads to zero training error. In these over-parameterized settings there are many global minima with zero training error, some of which generalize…

Machine Learning · Computer Science 2023-12-05 Julius Martinetz, Thomas Martinetz

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning, however, is beginning to emerge. It covers the following questions: 1)…

Machine Learning · Computer Science 2019-08-27 Tomaso Poggio, Andrzej Banburski, Qianli Liao

Explaining generalization in deep learning: progress and fundamental limits

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan

Understanding deep learning requires rethinking generalization

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the…

Machine Learning · Computer Science 2017-02-28 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Neural Networks and Polynomial Regression. Demystifying the Overparametrization Phenomena

In the context of neural network models, overparametrization refers to the phenomenon whereby these models appear to generalize well on unseen data, even though the number of parameters significantly exceeds the sample size, and the…

Machine Learning · Statistics 2020-03-25 Matt Emschwiller, David Gamarnik, Eren C. Kızıldağ, Ilias Zadik

Uniform convergence may be unable to explain generalization in deep learning

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan, J. Zico Kolter

On the Generalization Mystery in Deep Learning

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of…

Machine Learning · Computer Science 2022-06-07 Satrajit Chatterjee, Piotr Zielinski

Learning Regularization Parameters of Inverse Problems via Deep Neural Networks

In this work, we describe a new approach that uses deep neural networks (DNN) to obtain regularization parameters for solving inverse problems. We consider a supervised learning approach, where a network is trained to approximate the…

Numerical Analysis · Mathematics 2021-04-15 Babak Maboudi Afkham, Julianne Chung, Matthias Chung

Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power

It is well-known that modern neural networks are vulnerable to adversarial examples. To mitigate this problem, a series of robust learning algorithms have been proposed. However, although the robust training error can be near zero via some…

Machine Learning · Computer Science 2022-10-17 Binghui Li, Jikai Jin, Han Zhong, John E. Hopcroft, Liwei Wang

Regularizing linear inverse problems with convolutional neural networks

Deep convolutional neural networks trained on large datasets have emerged as an intriguing alternative for compressing images and solving inverse problems such as denoising and compressive sensing. However, it has only recently been realized…

Machine Learning · Computer Science 2019-07-09 Reinhard Heckel

Consistency for Large Neural Networks: Regression and Classification

Although overparameterized models have achieved remarkable practical success, their theoretical properties, particularly their generalization behavior, remain incompletely understood. The well-known double descent phenomenon suggests that…

Machine Learning · Statistics 2026-01-06 Haoran Zhan, Yingcun Xia

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

It is widely observed that deep learning models with learned parameters generalize well, even with many more model parameters than training samples. We systematically investigate the underlying reasons why deep neural networks…

Machine Learning · Computer Science 2017-11-29 Lei Wu, Zhanxing Zhu, Weinan E

Online Learning for the Random Feature Model in the Student-Teacher Framework

Deep neural networks are widely used prediction algorithms whose performance often improves as the number of weights increases, leading to over-parametrization. We consider a two-layered neural network whose first layer is frozen while the…

Machine Learning · Computer Science 2023-04-10 Roman Worschech, Bernd Rosenow

Theoretical Insight into Batch Normalization: Data Dependant Auto-Tuning of Regularization Rate

Batch normalization is widely used in deep learning to normalize intermediate activations. Deep networks suffer from notoriously increased training complexity, mandating careful initialization of weights, requiring lower learning rates,…

Machine Learning · Statistics 2022-10-19 Lakshmi Annamalai, Chetan Singh Thakur

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

Recent works have cast some light on the mystery of why deep nets fit any data and generalize despite being very overparametrized. This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and…

Machine Learning · Computer Science 2019-05-28 Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang

Learning and Generalization in Overparameterized Normalizing Flows

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using stochastic gradient descent with a sufficiently small learning rate and…

Machine Learning · Computer Science 2022-03-24 Kulin Shah, Amit Deshpande, Navin Goyal