Related papers: A Regularization-Sharpness Tradeoff for Linear Int…

200 papers

Many common estimators in machine learning and causal inference are linear smoothers, where the prediction is a weighted average of the training outcomes. Some estimators, such as ordinary least squares and kernel ridge regression, allow…

Machine Learning · Computer Science 2026-04-02 David Arbour, Harsh Parikh, Bijan Niknam, Elizabeth Stuart, Kara Rudolph, Avi Feller

In deep learning, the training process often finds an interpolator (a solution with 0 training loss), but the test loss is still low. This phenomenon, known as benign overfitting, is a major mystery that has received a lot of recent attention.…

Machine Learning · Computer Science 2023-05-29 Mo Zhou, Rong Ge

Motivated by surprisingly good generalization properties of learned deep neural networks in overparameterized scenarios and by the related double descent phenomenon, this paper analyzes the relation between smoothness and low generalization…

Machine Learning · Computer Science 2021-10-29 Yuege Xie, Hung-Hsu Chou, Holger Rauhut, Rachel Ward

We study the implicit regularization of optimization methods for linear models interpolating the training data in the under-parametrized and over-parametrized regimes. Since it is difficult to determine whether an optimizer converges to…

Machine Learning · Computer Science 2022-07-12 Sharan Vaswani, Reza Babanezhad, Jose Gallego-Posada, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux

We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple…

Machine Learning · Statistics 2022-06-17 Chen Cheng, John Duchi, Rohith Kuditipudi

In this work we establish an algorithm and distribution independent non-asymptotic trade-off between the model size, excess test loss, and training loss of linear predictors. Specifically, we show that models that perform well on the test…

Machine Learning · Statistics 2023-04-20 Nikhil Ghosh, Mikhail Belkin

Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new…

Statistics Theory · Mathematics 2022-06-07 Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie J. Su

Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined…

Statistics Theory · Mathematics 2022-02-15 Peng Zhao, Yun Yang, Qiao-Chu He

The bias-variance trade-off is a central concept in supervised learning. In classical statistics, increasing the complexity of a model (e.g., number of parameters) reduces bias but also increases variance. Until recently, it was commonly…

Machine Learning · Statistics 2022-03-25 Jason W. Rocks, Pankaj Mehta

Controlling the parameters' norm often yields good generalisation when training neural networks. Beyond simple intuitions, the relation between regularising parameters' norm and obtained estimators remains theoretically misunderstood. For…

Machine Learning · Statistics 2025-04-09 Etienne Boursier, Nicolas Flammarion

Modern machine learning models are often trained in a setting where the number of parameters exceeds the number of training samples. To understand the implicit bias of gradient descent in such overparameterized models, prior work has…

Machine Learning · Statistics 2025-10-29 Hannes Matt, Dominik Stöger

Within the statistical and machine learning literature, regularization techniques are often used to construct sparse (predictive) models. Most regularization strategies only work for data where all predictors are treated identically, such…

Computation · Statistics 2020-12-16 Sander Devriendt, Katrien Antonio, Tom Reynkens, Roel Verbelen

Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield…

Machine Learning · Statistics 2026-01-28 Julia Nakhleh, Robert D. Nowak

A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this…

Machine Learning · Computer Science 2025-12-19 Maria Matveev, Vit Fojtik, Hung-Hsu Chou, Gitta Kutyniok, Johannes Maly

The Ridgeless minimum $\ell_2$-norm interpolator in overparametrized linear regression has attracted considerable attention in recent years in both machine learning and statistics communities. While it seems to defy conventional wisdom that…

Statistics Theory · Mathematics 2026-01-21 Qiyang Han, Xiaocong Xu

High-dimensional predictive models, those with more measurements than observations, require regularization to be well defined, perform well empirically, and possess theoretical guarantees. The amount of regularization, often determined by…

Methodology · Statistics 2019-07-16 Darren Homrighausen, Daniel J. McDonald

State-of-the-art machine learning models can be vulnerable to very small input perturbations that are adversarially constructed. Adversarial training is an effective approach to defend against it. Formulated as a min-max problem, it…

Machine Learning · Statistics 2023-10-18 Antônio H. Ribeiro, Dave Zachariah, Francis Bach, Thomas B. Schön

The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit,…

Machine Learning · Statistics 2026-01-13 Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

Deep models, while being extremely versatile and accurate, are vulnerable to adversarial attacks: slight perturbations that are imperceptible to humans can completely flip the prediction of deep models. Many attack and defense mechanisms…

Machine Learning · Computer Science 2019-07-30 Kaiwen Wu, Yaoliang Yu

This paper presents a bias-variance tradeoff of graph Laplacian regularizer, which is widely used in graph signal processing and semi-supervised learning tasks. The scaling law of the optimal regularization parameter is specified in terms…

Machine Learning · Statistics 2017-08-02 Pin-Yu Chen, Sijia Liu