Related papers: Geometric Regularization from Overparameterization

200 papers

The risk of overparameterized models, in particular deep neural networks, is often double-descent shaped as a function of the model size. Recently, it was shown that the risk as a function of the early-stopping time can also be…

Machine Learning · Computer Science 2022-06-06 Fatih Furkan Yilmaz, Reinhard Heckel

Understanding generalization in overparameterized neural networks hinges on the interplay between the data geometry, neural architecture, and training dynamics. In this paper, we theoretically explore how data geometry controls this…

Machine Learning · Statistics 2026-05-08 Tongtong Liang, Alexander Cloninger, Rahul Parhi, Yu-Xiang Wang

Conventional statistical wisdom established a well-understood relationship between model complexity and prediction error, typically presented as a U-shaped curve reflecting a transition between under- and overfitting regimes. However,…

Machine Learning · Statistics 2023-10-31 Alicia Curth, Alan Jeffares, Mihaela van der Schaar

Although overparameterized models have achieved remarkable practical success, their theoretical properties, particularly their generalization behavior, remain incompletely understood. The well-known double descent phenomenon suggests that…

Machine Learning · Statistics 2026-01-06 Haoran Zhan, Yingcun Xia

'Double descent' delineates the generalization behaviour of models depending on the regime they belong to: under- or over-parameterized. The current theoretical understanding behind the occurrence of this phenomenon is primarily based on…

Machine Learning · Statistics 2022-03-15 Sidak Pal Singh, Aurelien Lucchi, Thomas Hofmann, Bernhard Schölkopf

Recently, the benefit of heavily overparameterized models has been observed in machine learning tasks: models with enough capacity to easily cross the interpolation threshold improve in generalization error compared to the classical…

High Energy Physics - Experiment · Physics 2025-09-03 Matthias Vigl, Lukas Heinrich

Empirically it has been observed that the performance of deep neural networks steadily improves as we increase model size, contradicting the classical view on overfitting and generalization. Recently, the double descent phenomenon has been…

Machine Learning · Computer Science 2021-07-28 Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu

We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to a robust…

Machine Learning · Computer Science 2016-02-12 Qinxun Bai, Steven Rosenberg, Zheng Wu, Stan Sclaroff

Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely understood. We analyze weight-space symmetries in neural networks and show that overparameterization…

Machine Learning · Computer Science 2026-05-11 Kusha Sareen, Mohammad Pedramfar, Sékou-Oumar Kaba, Mehran Shakerinava, Siamak Ravanbakhsh

Overparameterized models may have many interpolating solutions; implicit regularization refers to the hidden preference of a particular optimization method towards a certain interpolating solution among the many. A by now established line…

Machine Learning · Computer Science 2024-09-18 Hung-Hsu Chou, Holger Rauhut, Rachel Ward

Classical statistical learning theory predicts that overparameterized models should exhibit severe overfitting, yet modern deep neural networks with far more parameters than training samples consistently generalize well. This contradiction…

Machine Learning · Computer Science 2026-04-10 Zeran Johannsen

Double descent presents a counter-intuitive aspect within the machine learning domain, and researchers have observed its manifestation in various models and tasks. While some theoretical explanations have been proposed for this phenomenon…

Machine Learning · Computer Science 2024-04-26 Yufei Gu, Xiaoqing Zheng, Tomaso Aste

A regression model with more parameters than data points in the training data is overparametrized and has the capability to interpolate the training data. Based on the classical bias-variance tradeoff expressions, it is commonly assumed…

Machine Learning · Computer Science 2023-04-18 Tomas McKelvey

This paper investigates the double descent phenomenon in two-layer neural networks, focusing on the role of L1 regularization and representation dimensions. It explores an alternative double descent phenomenon, named sparse double descent.…

Machine Learning · Computer Science 2024-01-22 Ya Shi Zhang

Path regularization has been shown to be a very effective regularizer for training neural networks, leading to better generalization than common regularizers such as weight decay. We propose a first near-complete (as will be made…

Machine Learning · Computer Science 2026-04-09 Hao Yu

This paper explores the generalization loss of linear regression in variably parameterized families of models, both under-parameterized and over-parameterized. We show that the generalization curve can have an arbitrary number of peaks, and…

Machine Learning · Computer Science 2021-11-09 Lin Chen, Yifei Min, Mikhail Belkin, Amin Karbasi

There has been growing interest in generalization performance of large multilayer neural networks that can be trained to achieve zero training error, while generalizing well on test data. This regime is known as 'second descent' and it…

Machine Learning · Statistics 2022-09-30 Eng Hock Lee, Vladimir Cherkassky

The role of $L^2$ regularization, in the specific case of deep neural networks rather than more traditional machine learning models, is still not fully elucidated. We hypothesize that this complex interplay is due to the combination of…

Machine Learning · Computer Science 2019-02-11 Pierre H. Richemond, Yike Guo

Overparametrized models can exhibit an excellent generalization performance, although they should be prone to overfitting according to classical statistical theory. The discovery of the "double descent", indicating that the generalization…

Machine Learning · Computer Science 2026-05-22 Tino Werner

We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the…

Machine Learning · Computer Science 2024-05-02 Lorenzo Luzi, Yehuda Dar, Richard Baraniuk