Related papers: Deep Learning Generalization, Extrapolation, and O…

200 papers

We study the generalization of deep learning models in relation to the convex hull of their training sets. A trained image classifier basically partitions its domain via decision boundaries and assigns a class to each of those partitions.…

Machine Learning · Computer Science 2021-01-26 Roozbeh Yousefzadeh

There has been a long history of works showing that neural networks have a hard time extrapolating beyond the training set. A recent study by Balestriero et al. (2021) challenges this view: defining interpolation as the state of belonging to…

Machine Learning · Computer Science 2022-07-19 Laurent Bonnasse-Gahot

Overparameterized deep networks that generalize well have been key to the dramatic success of deep learning in recent years. The reasons for their remarkable ability to generalize are not well understood yet. When class labels in the…

Machine Learning · Computer Science 2026-02-03 Simran Ketha, Venkatakrishnan Ramaswamy

Overparameterization, the condition where models have more parameters than necessary to fit their training loss, is a crucial factor for the success of deep learning. However, the characteristics of the features learned by overparameterized…

Machine Learning · Computer Science 2024-07-02 Ahmet Cagri Duzgun, Samy Jelassi, Yuanzhi Li

It is frequently observed that overparameterized neural networks generalize well. Regarding such phenomena, existing theoretical work is mainly devoted to linear settings or fully-connected neural networks. This paper studies the learning…

Machine Learning · Statistics 2023-08-17 Tian-Yi Zhou, Xiaoming Huo

We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple…

Machine Learning · Statistics 2022-06-17 Chen Cheng, John Duchi, Rohith Kuditipudi

Modern deep neural networks are highly over-parameterized compared to the data on which they are trained, yet they often generalize remarkably well. A flurry of recent work has asked: why do deep networks not overfit to their training data?…

Machine Learning · Computer Science 2023-03-24 Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…

Machine Learning · Computer Science 2020-05-04 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

In this work, we study over-parameterization as a necessary condition for models to be able to extrapolate outside the convex hull of the training set. Specifically, we consider classification models, e.g., image classification…

Machine Learning · Computer Science 2022-03-22 Roozbeh Yousefzadeh

A recent paradigm views deep neural networks as discretizations of certain controlled ordinary differential equations, sometimes called neural ordinary differential equations. We make use of this perspective to link expressiveness of deep…

Optimization and Control · Mathematics 2020-07-20 Christa Cuchiero, Martin Larsson, Josef Teichmann

In the context of neural network models, overparametrization refers to the phenomenon whereby these models appear to generalize well on unseen data, even though the number of parameters significantly exceeds the sample sizes, and the…

Machine Learning · Statistics 2020-03-25 Matt Emschwiller, David Gamarnik, Eren C. Kızıldağ, Ilias Zadik

Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still…

Machine Learning · Computer Science 2019-04-09 Daniel Jakubovitz, Raja Giryes, Miguel R. D. Rodrigues

The neural network memorization problem is to study the expressive power of neural networks to interpolate a finite dataset. Although memorization is widely believed to have a close relationship with the strong generalizability of deep…

Machine Learning · Computer Science 2024-11-04 Lijia Yu, Xiao-Shan Gao, Lijun Zhang, Yibo Miao

The capacity to generalize beyond the range of training data is a pivotal challenge, often synonymous with a model's utility and robustness. This study investigates the comparative abilities of traditional machine learning (ML) models and…

Machine Learning · Computer Science 2024-03-05 Yong Yi Bay, Kathleen A. Yearick

Adversarial training is a widely used method to improve the robustness of deep neural networks (DNNs) over adversarial perturbations. However, it is empirically observed that adversarial training on over-parameterized networks often suffers…

Machine Learning · Statistics 2024-01-25 Zhongjie Shi, Fanghui Liu, Yuan Cao, Johan A. K. Suykens

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using stochastic gradient descent with a sufficiently small learning rate and…

Machine Learning · Computer Science 2022-03-24 Kulin Shah, Amit Deshpande, Navin Goyal

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan, J. Zico Kolter

The primary objective of learning methods is generalization. Classic uniform generalization bounds, which rely on VC-dimension or Rademacher complexity, fail to explain the significant attribute that over-parameterized models in deep…

Machine Learning · Computer Science 2025-03-07 Lijia Yu, Yibo Miao, Yifan Zhu, Xiao-Shan Gao, Lijun Zhang

The last decade has seen blossoming research in deep learning theory attempting to answer, "Why does deep learning generalize?" A powerful shift in perspective precipitated this progress: the study of overparametrized models in the…

Machine Learning · Statistics 2024-06-18 Patrik Reizinger, Szilvia Ujváry, Anna Mészáros, Anna Kerekes, Wieland Brendel, Ferenc Huszár