Related papers: Generalization Error in Deep Learning

Generalization Error of Generalized Linear Models in High Dimensions

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our…

Machine Learning · Computer Science 2020-05-04 Melikasadat Emami , Mojtaba Sahraee-Ardakan , Parthe Pandit , Sundeep Rangan , Alyson K. Fletcher

Generalization in Deep Learning

This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the…

Machine Learning · Statistics 2023-08-29 Kenji Kawaguchi , Leslie Pack Kaelbling , Yoshua Bengio

Decoding Generalization from Memorization in Deep Neural Networks

Overparameterized deep networks that generalize well have been key to the dramatic success of deep learning in recent years. The reasons for their remarkable ability to generalize are not well understood yet. When class labels in the…

Machine Learning · Computer Science 2026-02-03 Simran Ketha , Venkatakrishnan Ramaswamy

Understanding deep learning requires rethinking generalization

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the…

Machine Learning · Computer Science 2017-02-28 Chiyuan Zhang , Samy Bengio , Moritz Hardt , Benjamin Recht , Oriol Vinyals

Uniform convergence may be unable to explain generalization in deep learning

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan , J. Zico Kolter

An Information-Theoretic View for Deep Learning

Deep learning has transformed computer vision, natural language processing, and speech recognition\cite{badrinarayanan2017segnet, dong2016image, ren2017faster, ji20133d}. However, two critical questions remain obscure: (1) why do deep…

Machine Learning · Statistics 2018-10-03 Jingwei Zhang , Tongliang Liu , Dacheng Tao

On Generalization and Regularization in Deep Learning

Why do large neural network generalize so well on complex tasks such as image classification or speech recognition? What exactly is the role regularization for them? These are arguably among the most important open questions in machine…

Machine Learning · Statistics 2017-04-10 Pirmin Lemberger

Deep Learning Works in Practice. But Does it Work in Theory?

Deep learning relies on a very specific kind of neural networks: those superposing several neural layers. In the last few years, deep learning achieved major breakthroughs in many tasks such as image analysis, speech recognition, natural…

Artificial Intelligence · Computer Science 2018-02-01 Lê Nguyên Hoang , Rachid Guerraoui

Meet You Halfway: Explaining Deep Learning Mysteries

Deep neural networks perform exceptionally well on various learning tasks with state-of-the-art results. While these models are highly expressive and achieve impressively accurate solutions with excellent generalization abilities, they are…

Machine Learning · Computer Science 2022-06-10 Oriel BenShmuel

The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence

Deep learning networks have been trained to recognize speech, caption photographs and translate text between languages at high levels of performance. Although applications of deep learning networks to real world problems have become…

Neurons and Cognition · Quantitative Biology 2020-02-13 Terrence J. Sejnowski

Testing the Generalization Power of Neural Network Models Across NLI Benchmarks

Neural network models have been very successful in natural language inference, with the best models reaching 90% accuracy in some benchmarks. However, the success of these models turns out to be largely benchmark specific. We show that…

Computation and Language · Computer Science 2019-06-04 Aarne Talman , Stergios Chatzikyriakidis

A Selective Overview of Deep Learning

Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks…

Machine Learning · Statistics 2019-04-16 Jianqing Fan , Cong Ma , Yiqiao Zhong

Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness

The accuracy of deep learning, i.e., deep neural networks, can be characterized by dividing the total error into three main types: approximation error, optimization error, and generalization error. Whereas there are some satisfactory…

Machine Learning · Statistics 2021-11-03 Pengzhan Jin , Lu Lu , Yifa Tang , George Em Karniadakis

Generalization Through the Lens of Learning Dynamics

A machine learning (ML) system must learn not only to match the output of a target function on a training set, but also to generalize to novel situations in order to yield accurate predictions at deployment. In most practical applications,…

Machine Learning · Computer Science 2022-12-13 Clare Lyle

On the Generalization Mystery in Deep Learning

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of…

Machine Learning · Computer Science 2022-06-07 Satrajit Chatterjee , Piotr Zielinski

Explaining generalization in deep learning: progress and fundamental limits

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the…

Machine Learning · Computer Science 2021-10-19 Vaishnavh Nagarajan

With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization

Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit…

Machine Learning · Computer Science 2022-01-31 James Wang , Cheng-Lin Yang

Distant generalization by feedforward neural networks

This paper discusses the notion of generalization of training samples over long distances in the input space of a feedforward neural network. Such a generalization might occur in various ways, that differ in how great the contribution of…

Neural and Evolutionary Computing · Computer Science 2007-06-13 Artur Rataj

Deep Learning Generalization, Extrapolation, and Over-parameterization

We study the generalization of over-parameterized deep networks (for image classification) in relation to the convex hull of their training sets. Despite their great success, generalization of deep networks is considered a mystery. These…

Machine Learning · Computer Science 2022-03-22 Roozbeh Yousefzadeh

Linear Regression with Distributed Learning: A Generalization Error Perspective

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…

Machine Learning · Statistics 2021-11-03 Martin Hellkvist , Ayça Özçelikkale , Anders Ahlén