Related papers: Self-Regularized Learning Methods

Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization

A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this…

Machine Learning · Computer Science 2025-12-19 Maria Matveev , Vit Fojtik , Hung-Hsu Chou , Gitta Kutyniok , Johannes Maly

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error. In such cases, the choice of the optimization algorithm and its respective hyper-parameters introduces…

Machine Learning · Computer Science 2019-12-06 Gauthier Gidel , Francis Bach , Simon Lacoste-Julien

Implicit Regularization in Deep Learning

In an attempt to better understand generalization in deep learning, we study several possible explanations. We show that implicit regularization induced by the optimization method is playing a key role in generalization and success of deep…

Machine Learning · Computer Science 2017-09-11 Behnam Neyshabur

Learning Acceleration Algorithms for Fast Parametric Convex Optimization with Certified Robustness

We develop a machine-learning framework to learn hyperparameter sequences for accelerated first-order methods (e.g., the step size and momentum sequences in accelerated gradient descent) to quickly solve parametric convex optimization…

Optimization and Control · Mathematics 2025-10-07 Rajiv Sambharya , Jinho Bok , Nikolai Matni , George Pappas

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may…

Machine Learning · Computer Science 2020-10-20 Noam Razin , Nadav Cohen

Implicit Regularization in Over-parameterized Neural Networks

Over-parameterized neural networks generalize well in practice without any explicit regularization. Although it has not been proven yet, empirical evidence suggests that implicit regularization plays a crucial role in deep learning and…

Machine Learning · Computer Science 2019-03-07 Masayoshi Kubo , Ryotaro Banno , Hidetaka Manabe , Masataka Minoji

Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study

The notion of implicit bias, or implicit regularization, has been suggested as a means to explain the surprising generalization ability of modern-days overparameterized learning algorithms. This notion refers to the tendency of the…

Machine Learning · Computer Science 2020-12-23 Assaf Dauber , Meir Feder , Tomer Koren , Roi Livni

Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent

We propose graph-dependent implicit regularisation strategies for distributed stochastic subgradient descent (Distributed SGD) for convex problems in multi-agent learning. Under the standard assumptions of convexity, Lipschitz continuity,…

Machine Learning · Computer Science 2018-09-20 Dominic Richards , Patrick Rebeschini

Implicit Gradient Regularization

Gradient descent can be surprisingly good at optimizing deep neural networks without overfitting and without explicit regularization. We find that the discrete steps of gradient descent implicitly regularize models by penalizing gradient…

Machine Learning · Computer Science 2022-07-20 David G. T. Barrett , Benoit Dherin

Learning Semidefinite Regularizers

Regularization techniques are widely employed in optimization-based approaches for solving ill-posed inverse problems in data analysis and scientific computing. These methods are based on augmenting the objective with a penalty function,…

Optimization and Control · Mathematics 2021-06-08 Yong Sheng Soh , Venkat Chandrasekaran

Regularization Techniques for Learning with Matrices

There is growing body of learning problems for which it is natural to organize the parameters into matrix, so as to appropriately regularize the parameters under some matrix norm (in order to impose some more sophisticated prior knowledge).…

Machine Learning · Computer Science 2010-10-19 Sham M. Kakade , Shai Shalev-Shwartz , Ambuj Tewari

Iterative Regularization for Learning with Convex Loss Functions

We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method. Unlike other regularization approaches, in iterative regularization no constraint…

Machine Learning · Statistics 2015-04-02 Junhong Lin , Lorenzo Rosasco , Ding-Xuan Zhou

On Learning the Optimal Regularization Parameter in Inverse Problems

Selecting the best regularization parameter in inverse problems is a classical and yet challenging problem. Recently, data-driven approaches have become popular to tackle this challenge. These approaches are appealing since they do require…

Statistics Theory · Mathematics 2025-10-22 Jonathan Chirinos Rodriguez , Ernesto De Vito , Cesare Molinari , Lorenzo Rosasco , Silvia Villa

Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent

We propose \textit{Meta-Regularization}, a novel approach for the adaptive choice of the learning rate in first-order gradient descent methods. Our approach modifies the objective function by adding a regularization term on the learning…

Machine Learning · Computer Science 2021-04-13 Guangzeng Xie , Hao Jin , Dachao Lin , Zhihua Zhang

From inexact optimization to learning via gradient concentration

Optimization in machine learning typically deals with the minimization of empirical objectives defined by training data. However, the ultimate goal of learning is to minimize the error on future data (test error), for which the training…

Machine Learning · Statistics 2021-11-08 Bernhard Stankewitz , Nicole Mücke , Lorenzo Rosasco

Self-Paced Learning: an Implicit Regularization Perspective

Self-paced learning (SPL) mimics the cognitive mechanism of humans and animals that gradually learns from easy to hard samples. One key issue in SPL is to obtain better weighting strategy that is determined by minimizer function. Existing…

Machine Learning · Computer Science 2016-09-20 Yanbo Fan , Ran He , Jian Liang , Bao-Gang Hu

Implicit Regularization in Deep Matrix Factorization

Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora , Nadav Cohen , Wei Hu , Yuping Luo

A Statistical Theory of Regularization-Based Continual Learning

We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks, with emphasis on how different regularization terms affect the model performance. We first derive the convergence rate…

Machine Learning · Computer Science 2024-06-11 Xuyang Zhao , Huiyuan Wang , Weiran Huang , Wei Lin

Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously

Generalization is a central problem in Machine Learning. Most prediction methods require careful calibration of hyperparameters carried out on a hold-out \textit{validation} dataset to achieve generalization. The main goal of this paper is…

Machine Learning · Computer Science 2020-06-15 Karim Lounici , Katia Meziani , Benjamin Riu

Learning Regularizers: Learning Optimizers that can Regularize

Learned Optimizers (LOs), a type of Meta-learning, have gained traction due to their ability to be parameterized and trained for efficient optimization. Traditional gradient-based methods incorporate explicit regularization techniques such…

Machine Learning · Computer Science 2025-10-13 Suraj Kumar Sahoo , Narayanan C Krishnan