Related papers: On the interplay between data structure and loss f…

200 papers

Overparametrized models can exhibit an excellent generalization performance, although they should be prone to overfitting according to classical statistical theory. The discovery of the "double descent", indicating that the generalization…

Machine Learning · Computer Science 2026-05-22 Tino Werner

We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden…

Statistics Theory · Mathematics 2022-03-28 Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová

Although overparameterized models have achieved remarkable practical success, their theoretical properties, particularly their generalization behavior, remain incompletely understood. The well-known double descent phenomenon suggests that…

Machine Learning · Statistics 2026-01-06 Haoran Zhan, Yingcun Xia

The success of deep neural networks largely depends on the statistical structure of the training data. While learning dynamics and generalization on isotropic data are well-established, the impact of pronounced anisotropy on these crucial…

Machine Learning · Statistics 2026-01-13 Taishi Watanabe, Ryo Karakida, Jun-nosuke Teramae

The generalization performance of a machine learning algorithm such as a neural network depends in a non-trivial way on the structure of the data distribution. To analyze the influence of data structure on test loss dynamics, we study an…

Machine Learning · Statistics 2022-03-16 Blake Bordelon, Cengiz Pehlevan

Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of…

In this expository note we describe a surprising phenomenon in overparameterized linear regression, where the dimension exceeds the number of samples: there is a regime where the test risk of the estimator found by gradient descent…

Machine Learning · Statistics 2019-12-17 Preetum Nakkiran

Double descent is a surprising phenomenon in machine learning, in which as the number of model parameters grows relative to the number of data, test error drops as models grow ever larger into the highly overparameterized (data…

Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparametrized models have been extensively studied in recent years, and the…

Machine Learning · Computer Science 2024-02-02 Hamed Hassani, Adel Javanmard

We study the linear subspace fitting problem in the overparameterized setting, where the estimated subspace can perfectly interpolate the training examples. Our scope includes the least-squares solutions to subspace fitting tasks with…

Machine Learning · Computer Science 2020-08-21 Yehuda Dar, Paul Mayer, Lorenzo Luzi, Richard G. Baraniuk

In recent years, significant attention in deep learning theory has been devoted to analyzing when models that interpolate their training data can still generalize well to unseen examples. Many insights have been gained from studying models…

Machine Learning · Statistics 2023-10-24 Jacob A. Zavatone-Veth, Cengiz Pehlevan

We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer…

Machine Learning · Computer Science 2022-09-29 Yehuda Dar, Richard G. Baraniuk

We study the generalization behavior of transfer learning of deep neural networks (DNNs). We adopt the overparameterization perspective -- featuring interpolation of the training data (i.e., approximately zero train error) and the double…

Machine Learning · Computer Science 2023-06-13 Yehuda Dar, Lorenzo Luzi, Richard G. Baraniuk

We compare classification and regression tasks in an overparameterized linear model with Gaussian features. On the one hand, we show that with sufficient overparameterization all training points are support vectors: solutions obtained by…

Machine Learning · Computer Science 2021-10-15 Vidya Muthukumar, Adhyyan Narang, Vignesh Subramanian, Mikhail Belkin, Daniel Hsu, Anant Sahai

The loss function is arguably among the most important hyperparameters for a neural network. Many loss functions have been designed to date, making a correct choice nontrivial. However, elaborate justifications regarding the choice of the…

Machine Learning · Computer Science 2022-10-31 Simon Dräger, Jannik Dunkelau

Deep neural networks can achieve remarkable generalization performances while interpolating the training data perfectly. Rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a "double descent" -…

Machine Learning · Computer Science 2020-04-06 Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive…

Machine Learning · Computer Science 2024-03-13 Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu

Autonomous machine learning systems that learn many tasks in sequence are prone to the catastrophic forgetting problem. Mathematical theory is needed in order to understand the extent of forgetting during continual learning. As a…

Machine Learning · Computer Science 2025-02-18 Daniel Goldfarb, Paul Hand

Deep networks are typically trained with many more parameters than the size of the training dataset. Recent empirical evidence indicates that the practice of overparameterization not only benefits training large models, but also assists -…

Machine Learning · Computer Science 2020-12-17 Xiangyu Chang, Yingcong Li, Samet Oymak, Christos Thrampoulidis

Conventional statistical wisdom established a well-understood relationship between model complexity and prediction error, typically presented as a U-shaped curve reflecting a transition between under- and overfitting regimes. However,…

Machine Learning · Statistics 2023-10-31 Alicia Curth, Alan Jeffares, Mihaela van der Schaar