Related papers: EarlyStopping: Implicit Regularization for Iterati…

NYTRO: When Subsampling Meets Early Stopping

Early stopping is a well known approach to reduce the time complexity for performing training and model selection of large scale learning machines. On the other hand, memory/space (rather than time) complexity is the main constraint in many…

Machine Learning · Statistics 2018-02-02 Tomas Angles, Raffaello Camoriano, Alessandro Rudi, Lorenzo Rosasco

Early stopping for kernel boosting algorithms: A general analysis with localized complexities

Early stopping of iterative algorithms is a widely-used form of regularization in statistics, commonly used in conjunction with boosting and related gradient-type algorithms. Although consistency results have been established in some…

Machine Learning · Statistics 2018-03-15 Yuting Wei, Fanny Yang, Martin J. Wainwright

Early Stopping without a Validation Set

Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split…

Machine Learning · Computer Science 2017-06-07 Maren Mahsereci, Lukas Balles, Christoph Lassner, Philipp Hennig

Estimating Implicit Regularization in Deep Learning

Deep learning systems are known to exhibit implicit regularization (alt. implicit bias), favoring simple solutions instead of merely minimizing the loss function. In some cases, we can analytically derive the implicit regularization --…

Machine Learning · Statistics 2026-05-08 Joseph H. Rudoler, Kevin Tan, Giles Hooker, Konrad P. Kording

Early Stopping for Nonparametric Testing

Early stopping of iterative algorithms is an algorithmic regularization method to avoid over-fitting in estimation and classification. In this paper, we show that early stopping can also be applied to obtain the minimax optimal testing in a…

Statistics Theory · Mathematics 2018-09-18 Meimei Liu, Guang Cheng

Early-Learning Regularization Prevents Memorization of Noisy Labels

We propose a novel framework to perform classification via deep learning in the presence of noisy annotations. When trained on noisy labels, deep neural networks have been observed to first fit the training data with clean labels during an…

Machine Learning · Computer Science 2020-10-26 Sheng Liu, Jonathan Niles-Weed, Narges Razavian, Carlos Fernandez-Granda

Don't relax: early stopping for convex regularization

We consider the problem of designing efficient regularization algorithms when regularization is encoded by a (strongly) convex functional. Unlike classical penalization methods based on a relaxation approach, we propose an iterative method…

Optimization and Control · Mathematics 2017-07-19 Simon Matet, Lorenzo Rosasco, Silvia Villa, Bang Long Vu

Implicit Sparse Regularization: The Impact of Depth and Early Stopping

In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-N networks, under…

Machine Learning · Statistics 2021-10-28 Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

Conformal inference is (almost) free for neural networks trained with early stopping

Early stopping based on hold-out data is a popular regularization technique designed to mitigate overfitting and increase the predictive accuracy of neural networks. Models trained with early stopping often provide relatively accurate…

Machine Learning · Statistics 2023-06-28 Ziyi Liang, Yanfei Zhou, Matteo Sesia

Early Stopping of Untrained Convolutional Neural Networks

In recent years, new regularization methods based on (deep) neural networks have shown very promising empirical performance for the numerical solution of ill-posed problems, e.g., in medical imaging and imaging science. Due to the…

Numerical Analysis · Mathematics 2024-06-07 Tim Jahn, Bangti Jin

Iterative Regularization with k-support Norm: An Important Complement to Sparse Recovery

Sparse recovery is ubiquitous in machine learning and signal processing. Due to the NP-hard nature of sparse recovery, existing methods are known to suffer either from restrictive (or even unknown) applicability conditions, or high…

Signal Processing · Electrical Eng. & Systems 2024-03-21 William de Vazelhes, Bhaskar Mukhoty, Xiao-Tong Yuan, Bin Gu

How does Early Stopping Help Generalization against Label Noise?

Noisy labels are very common in real-world training data, which lead to poor generalization on test data because of overfitting to the noisy labels. In this paper, we claim that such overfitting can be avoided by "early stopping" training a…

Machine Learning · Computer Science 2020-09-09 Hwanjun Song, Minseok Kim, Dongmin Park, Jae-Gil Lee

GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling

Machine learning models are often learned by minimising a loss function on the training data using a gradient descent algorithm. These models often suffer from overfitting, leading to a decline in predictive performance on unseen data. A…

Machine Learning · Computer Science 2026-01-28 Arash Jamshidi, Lauri Seppäläinen, Katsiaryna Haitsiukevich, Hoang Phuc Hau Luu, Anton Björklund, Kai Puolamäki

Instance-dependent Early Stopping

In machine learning practice, early stopping has been widely used to regularize models and can save computational costs by halting the training process when the model's performance on a validation set stops improving. However, conventional…

Machine Learning · Computer Science 2025-02-12 Suqin Yuan, Runqi Lin, Lei Feng, Bo Han, Tongliang Liu

Iterative regularization for low complexity regularizers

Iterative regularization exploits the implicit bias of an optimization algorithm to regularize ill-posed problems. Constructing algorithms with such built-in regularization mechanisms is a classic challenge in inverse problems but also in…

Optimization and Control · Mathematics 2022-02-02 Cesare Molinari, Mathurin Massias, Lorenzo Rosasco, Silvia Villa

Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution

Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often…

Machine Learning · Computer Science 2020-06-09 Cong Ma, Kaizheng Wang, Yuejie Chi, Yuxin Chen

RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization

While Reinforcement Learning for Verifiable Rewards (RLVR) is powerful for training large reasoning models, its training dynamics harbor a critical challenge: RL overfitting, where models gain training rewards but lose generalization. Our…

Artificial Intelligence · Computer Science 2025-11-07 Zeng Zhiyuan, Jiashuo Liu, Zhangyue Yin, Ge Zhang, Wenhao Huang, Xipeng Qiu

Late Stopping: Avoiding Confidently Learning from Mislabeled Examples

Sample selection is a prevalent method in learning with noisy labels, where small-loss data are typically considered as correctly labeled data. However, this method may not effectively identify clean hard examples with large losses, which…

Machine Learning · Computer Science 2023-08-29 Suqin Yuan, Lei Feng, Tongliang Liu

Pruning Pre-trained Language Models with Principled Importance and Self-regularization

Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to…

Computation and Language · Computer Science 2023-05-23 Siyu Ren, Kenny Q. Zhu

Early stopping and non-parametric regression: An optimal data-dependent stopping rule

The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy…

Machine Learning · Statistics 2013-06-18 Garvesh Raskutti, Martin J. Wainwright, Bin Yu