Related papers: Studying Generalization Through Data Averaging

Linear Regression with Distributed Learning: A Generalization Error Perspective

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…

Machine Learning · Statistics 2021-11-03 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Understanding Generalization via Set Theory

Generalization is at the core of machine learning models. However, the definition of generalization is not entirely clear. We employ set theory to introduce the concepts of algorithms, hypotheses, and dataset generalization. We analyze the…

Machine Learning · Computer Science 2023-11-14 Shiqi Liu

Learning Curves for SGD on Structured Features

The generalization performance of a machine learning algorithm such as a neural network depends in a non-trivial way on the structure of the data distribution. To analyze the influence of data structure on test loss dynamics, we study an…

Machine Learning · Statistics 2022-03-16 Blake Bordelon, Cengiz Pehlevan

Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training

As deep neural networks are highly expressive, it is important to find solutions with small generalization gap (the difference between the performance on the training data and unseen data). Focusing on the stochastic nature of training, we…

Machine Learning · Computer Science 2023-10-31 Rie Johnson, Tong Zhang

Biased Generalization in Diffusion Models

Generalization in generative modeling is defined as the ability to learn an underlying distribution from a finite dataset and produce novel samples, with evaluation largely driven by held-out performance and perceived sample quality. In…

Machine Learning · Computer Science 2026-03-05 Jerome Garnier-Brun, Luca Biggio, Davide Beltrame, Marc Mézard, Luca Saglietti

Train longer, generalize better: closing the generalization gap in large batch training of neural networks

Background: Deep learning models are typically trained using stochastic gradient descent or one of its variants. These methods update the weights using their gradient, estimated from a small fraction of the training data. It has been…

Machine Learning · Statistics 2018-01-03 Elad Hoffer, Itay Hubara, Daniel Soudry

Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization

Domain generalization is the problem of machine learning when the training data and the test data come from different data domains. We present a simple theoretical model of learning to generalize across domains in which there is a…

Machine Learning · Computer Science 2020-02-14 Vikas K. Garg, Adam Kalai, Katrina Ligett, Zhiwei Steven Wu

Assessing Generalization of SGD via Disagreement

We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate…

Machine Learning · Computer Science 2022-05-17 Yiding Jiang, Vaishnavh Nagarajan, Christina Baek, J. Zico Kolter

Generalization Error for Linear Regression under Distributed Learning

Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, generalization performance of such approaches is not well…

Machine Learning · Statistics 2020-05-05 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

The Calibration Generalization Gap

Calibration is a fundamental property of a good predictive model: it requires that the model predicts correctly in proportion to its confidence. Modern neural networks, however, provide no strong guarantees on their calibration -- and can…

Machine Learning · Computer Science 2022-10-07 A. Michael Carrell, Neil Mallinar, James Lucas, Preetum Nakkiran

Distributional Generalization: A New Kind of Generalization

We introduce a new notion of generalization -- Distributional Generalization -- which roughly states that outputs of a classifier at train and test time are close *as distributions*, as opposed to close in just their average error. For…

Machine Learning · Computer Science 2020-10-16 Preetum Nakkiran, Yamini Bansal

Modeling Generalization in Machine Learning: A Methodological and Computational Study

As machine learning becomes more and more available to the general public, theoretical questions are turning into pressing practical issues. Possibly, one of the most relevant concerns is the assessment of our confidence in trusting machine…

Machine Learning · Computer Science 2020-06-30 Pietro Barbiero, Giovanni Squillero, Alberto Tonda

Out-of-Distribution Generalization in Kernel Regression

In real-world applications, the data-generating process for training a machine learning model often differs from what the model encounters in the test stage. Understanding how and whether machine learning models generalize under such…

Machine Learning · Statistics 2022-02-08 Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

Predicting the Generalization Gap in Deep Networks with Margin Distributions

As shown in recent research, deep neural networks can perfectly fit randomly labeled data, but with very poor accuracy on held out data. This phenomenon indicates that loss functions such as cross-entropy are not a reliable indicator of…

Machine Learning · Statistics 2019-06-13 Yiding Jiang, Dilip Krishnan, Hossein Mobahi, Samy Bengio

Revisiting Generalization Measures Beyond IID: An Empirical Study under Distributional Shift

Generalization remains a central yet unresolved challenge in deep learning, particularly the ability to predict a model's performance beyond its training distribution using quantities available prior to test-time evaluation. Building on the…

Machine Learning · Computer Science 2026-02-03 Sora Nakai, Youssef Fadhloun, Kacem Mathlouthi, Kotaro Yoshida, Ganesh Talluri, Ioannis Mitliagkas, Hiroki Naganuma

Understanding Why Neural Networks Generalize Well Through GSNR of Parameters

As deep neural networks (DNNs) achieve tremendous success across many application domains, researchers have explored from many angles why they generalize well. In this paper, we provide a novel perspective on these issues using the…

Machine Learning · Computer Science 2020-02-25 Jinlong Liu, Guoqing Jiang, Yunzhi Bai, Ting Chen, Huayan Wang

Generalization Error of Generalized Linear Models in High Dimensions

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…

Machine Learning · Computer Science 2020-05-04 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design

Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made…

Machine Learning · Computer Science 2021-04-27 Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik

SGD Implicitly Regularizes Generalization Error

We derive a simple and model-independent formula for the change in the generalization gap due to a gradient descent update. We then compare the change in the test error for stochastic gradient descent to the change in test error from an…

Machine Learning · Computer Science 2021-04-13 Daniel A. Roberts

Separating Geometry from Probability in the Analysis of Generalization

The goal of machine learning is to find models that minimize prediction error on data that has not yet been seen. Its operational paradigm assumes access to a dataset $S$ and articulates a scheme for evaluating how well a given model…

Machine Learning · Computer Science 2026-04-22 Maxim Raginsky, Benjamin Recht