Related papers: Geometric Regularization from Overparameterization

Regularization-wise double descent: Why it occurs and how to eliminate it

The risk of overparameterized models, in particular deep neural networks, is often double-descent shaped as a function of the model size. Recently, it was shown that the risk as a function of the early-stopping time can also be…

Machine Learning · Computer Science 2022-06-06 Fatih Furkan Yilmaz, Reinhard Heckel

Generalization Below the Edge of Stability: The Role of Data Geometry

Understanding generalization in overparameterized neural networks hinges on the interplay between the data geometry, neural architecture, and training dynamics. In this paper, we theoretically explore how data geometry controls this…

Machine Learning · Statistics 2026-05-08 Tongtong Liang, Alexander Cloninger, Rahul Parhi, Yu-Xiang Wang

A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning

Conventional statistical wisdom established a well-understood relationship between model complexity and prediction error, typically presented as a U-shaped curve reflecting a transition between under- and overfitting regimes. However,…

Machine Learning · Statistics 2023-10-31 Alicia Curth, Alan Jeffares, Mihaela van der Schaar

Consistency for Large Neural Networks: Regression and Classification

Although overparameterized models have achieved remarkable practical success, their theoretical properties, particularly their generalization behavior, remain incompletely understood. The well-known double descent phenomenon suggests that…

Machine Learning · Statistics 2026-01-06 Haoran Zhan, Yingcun Xia

Phenomenology of Double Descent in Finite-Width Neural Networks

'Double descent' delineates the generalization behaviour of models depending on the regime they belong to: under- or over-parameterized. The current theoretical understanding behind the occurrence of this phenomenon is primarily based on…

Machine Learning · Statistics 2022-03-15 Sidak Pal Singh, Aurelien Lucchi, Thomas Hofmann, Bernhard Schölkopf

Double Descent and Overparameterization in Particle Physics Data

Recently, the benefit of heavily overparameterized models has been observed in machine learning tasks: models with enough capacity to easily cross the interpolation threshold improve in generalization error compared to the classical…

High Energy Physics - Experiment · Physics 2025-09-03 Matthias Vigl, Lukas Heinrich

On the Role of Optimization in Double Descent: A Least Squares Study

Empirically, it has been observed that the performance of deep neural networks steadily improves as we increase model size, contradicting the classical view on overfitting and generalization. Recently, the double descent phenomenon has been…

Machine Learning · Computer Science 2021-07-28 Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu

Class Probability Estimation via Differential Geometric Regularization

We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to a robust…

Machine Learning · Computer Science 2016-02-12 Qinxun Bai, Steven Rosenberg, Zheng Wu, Stan Sclaroff

The Role of Symmetry in Optimizing Overparameterized Networks

Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely understood. We analyze weight-space symmetries in neural networks and show that overparameterization…

Machine Learning · Computer Science 2026-05-11 Kusha Sareen, Mohammad Pedramfar, Sékou-Oumar Kaba, Mehran Shakerinava, Siamak Ravanbakhsh

Robust Implicit Regularization via Weight Normalization

Overparameterized models may have many interpolating solutions; implicit regularization refers to the hidden preference of a particular optimization method towards a certain interpolating solution among the many. A by now established line…

Machine Learning · Computer Science 2024-09-18 Hung-Hsu Chou, Holger Rauhut, Rachel Ward

Implicit Regularization and Generalization in Overparameterized Neural Networks

Classical statistical learning theory predicts that overparameterized models should exhibit severe overfitting, yet modern deep neural networks with far more parameters than training samples consistently generalize well. This contradiction…

Machine Learning · Computer Science 2026-04-10 Zeran Johannsen

Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space

Double descent presents a counter-intuitive aspect within the machine learning domain, and researchers have observed its manifestation in various models and tasks. While some theoretical explanations have been proposed for this phenomenon…

Machine Learning · Computer Science 2024-04-26 Yufei Gu, Xiaoqing Zheng, Tomaso Aste

Analysis of Interpolating Regression Models and the Double Descent Phenomenon

A regression model with more parameters than data points in the training data is overparametrized and has the capability to interpolate the training data. Based on the classical bias-variance tradeoff expressions, it is commonly assumed…

Machine Learning · Computer Science 2023-04-18 Tomas McKelvey

Manipulating Sparse Double Descent

This paper investigates the double descent phenomenon in two-layer neural networks, focusing on the role of L1 regularization and representation dimensions. It explores an alternative double descent phenomenon, named sparse double descent.…

Machine Learning · Computer Science 2024-01-22 Ya Shi Zhang

Path Regularization: A Near-Complete and Optimal Nonasymptotic Generalization Theory for Multilayer Neural Networks and Double Descent Phenomenon

Path regularization has been shown to be a very effective regularization for training neural networks, leading to better generalization than common regularizations such as weight decay. We propose a first near-complete (as will be made…

Machine Learning · Computer Science 2026-04-09 Hao Yu

Multiple Descent: Design Your Own Generalization Curve

This paper explores the generalization loss of linear regression in variably parameterized families of models, both under-parameterized and over-parameterized. We show that the generalization curve can have an arbitrary number of peaks, and…

Machine Learning · Computer Science 2021-11-09 Lin Chen, Yifei Min, Mikhail Belkin, Amin Karbasi

VC Theoretical Explanation of Double Descent

There has been growing interest in generalization performance of large multilayer neural networks that can be trained to achieve zero training error, while generalizing well on test data. This regime is known as 'second descent' and it…

Machine Learning · Statistics 2022-09-30 Eng Hock Lee, Vladimir Cherkassky

Combining learning rate decay and weight decay with complexity gradient descent - Part I

The role of $L^2$ regularization, in the specific case of deep neural networks rather than more traditional machine learning models, is still not fully elucidated. We hypothesize that this complex interplay is due to the combination of…

Machine Learning · Computer Science 2019-02-11 Pierre H. Richemond, Yike Guo

Double descent for least-squares interpolation on contaminated data: A simulation study

Overparametrized models can exhibit an excellent generalization performance, although they should be prone to overfitting according to classical statistical theory. The discovery of the "double descent", indicating that the generalization…

Machine Learning · Computer Science 2026-05-22 Tino Werner

Double Descent and Other Interpolation Phenomena in GANs

We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the…

Machine Learning · Computer Science 2024-05-02 Lorenzo Luzi, Yehuda Dar, Richard Baraniuk