Related papers: Understanding Implicit Regularization in Over-Para…

200 papers

Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined…

Statistics Theory · Mathematics 2022-02-15 Peng Zhao, Yun Yang, Qiao-Chu He

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error. In such cases, the choice of the optimization algorithm and its respective hyper-parameters introduces…

Machine Learning · Computer Science 2019-12-06 Gauthier Gidel, Francis Bach, Simon Lacoste-Julien

In this paper, we design a regularization-free algorithm for high-dimensional support vector machines (SVMs) by integrating over-parameterization with Nesterov's smoothing method, and provide theoretical guarantees for the induced implicit…

Statistics Theory · Mathematics 2023-10-27 Yang Sui, Xin He, Yang Bai

We consider whether algorithmic choices in over-parameterized linear matrix factorization introduce implicit regularization. We focus on noiseless matrix sensing over rank-$r$ positive semi-definite (PSD) matrices in $\mathbb{R}^{n \times…

Machine Learning · Statistics 2019-09-16 Kelly Geyer, Anastasios Kyrillidis, Amir Kalev

Recently, there has been significant progress in understanding the convergence and generalization properties of gradient-based methods for training overparameterized learning models. However, many aspects including the role of small random…

Machine Learning · Computer Science 2023-07-04 Mahdi Soltanolkotabi, Dominik Stöger, Changzhi Xie

We investigate implicit regularization schemes for gradient descent methods applied to unpenalized least squares regression to solve the problem of reconstructing a sparse signal from an underdetermined system of linear measurements under…

Machine Learning · Statistics 2019-09-12 Tomas Vaškevičius, Varun Kanade, Patrick Rebeschini

Modern machine learning models are often trained in a setting where the number of parameters exceeds the number of training samples. To understand the implicit bias of gradient descent in such overparameterized models, prior work has…

Machine Learning · Statistics 2025-10-29 Hannes Matt, Dominik Stöger

We study the asymmetric matrix factorization problem under a natural nonconvex formulation with arbitrary overparametrization. The model-free setting is considered, with minimal assumption on the rank or singular values of the observed…

Machine Learning · Computer Science 2023-08-22 Liwei Jiang, Yudong Chen, Lijun Ding

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations…

Machine Learning · Statistics 2026-03-19 Max Schölpple, Liu Fanghui, Ingo Steinwart

We consider networks, trained via stochastic gradient descent to minimize $\ell_2$ loss, with the training labels perturbed by independent noise at each iteration. We characterize the behavior of the training dynamics near any parameter…

Machine Learning · Computer Science 2020-07-23 Guy Blanc, Neha Gupta, Gregory Valiant, Paul Valiant

Over-parameterized neural networks generalize well in practice without any explicit regularization. Although it has not been proven yet, empirical evidence suggests that implicit regularization plays a crucial role in deep learning and…

Machine Learning · Computer Science 2019-03-07 Masayoshi Kubo, Ryotaro Banno, Hidetaka Manabe, Masataka Minoji

Recovering jointly sparse signals in the multiple measurement vectors (MMV) setting is a fundamental problem in machine learning, but traditional methods often require careful parameter tuning or prior knowledge of the sparsity of the…

Machine Learning · Computer Science 2026-02-02 Lakshmi Jayalal, Sheetal Kalyani

We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix $X$ with gradient descent on a factorization of $X$. We conjecture and provide empirical and theoretical evidence that with small enough…

Machine Learning · Statistics 2017-05-26 Suriya Gunasekar, Blake Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nathan Srebro

Gradient descent can be surprisingly good at optimizing deep neural networks without overfitting and without explicit regularization. We find that the discrete steps of gradient descent implicitly regularize models by penalizing gradient…

Machine Learning · Computer Science 2022-07-20 David G. T. Barrett, Benoit Dherin

Implicit regularization refers to the tendency of local search algorithms to converge to low-dimensional solutions, even when such structures are not explicitly enforced. Despite its ubiquity, the mechanism underlying this behavior remains…

Machine Learning · Computer Science 2025-12-10 Jianhao Ma, Geyu Liang, Salar Fattahi

We show that the gradient descent algorithm provides an implicit regularization effect in the learning of over-parameterized matrix factorization models and one-hidden-layer neural networks with quadratic activations. Concretely, we show…

Machine Learning · Computer Science 2019-02-15 Yuanzhi Li, Tengyu Ma, Hongyang Zhang

Over the past few years, trace regression models have received considerable attention in the context of matrix completion, quantum state tomography, and compressed sensing. Estimation of the underlying matrix from regularization-based…

Machine Learning · Statistics 2015-04-24 Martin Slawski, Ping Li, Matthias Hein

Regularized nonnegative low-rank approximations, such as sparse Nonnegative Matrix Factorization or sparse Nonnegative Tucker Decomposition, form an important branch of dimensionality reduction models known for their enhanced…

Machine Learning · Computer Science 2025-01-31 Jeremy E. Cohen, Valentin Leplat

Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo

Parameter-free algorithms are online learning algorithms that do not require setting learning rates. They achieve optimal regret with respect to the distance between the initial point and any competitor. Yet, parameter-free algorithms do…

Machine Learning · Computer Science 2022-03-22 Keyi Chen, Ashok Cutkosky, Francesco Orabona