Related papers: Statistically guided deep learning

On the rate of convergence of an over-parametrized deep neural network regression estimate learned by gradient descent

Nonparametric regression with random design is considered. The $L_2$ error with integration with respect to the design measure is used as the error criterion. An over-parametrized deep neural network regression estimate with logistic…

Statistics Theory · Mathematics 2025-04-07 Michael Kohler

Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent

Estimation of a regression function from independent and identically distributed random variables is considered. The $L_2$ error with integration with respect to the design measure is used as an error criterion. Over-parametrized deep…

Statistics Theory · Mathematics 2022-10-05 Michael Kohler, Adam Krzyzak

Optimal Nonparametric Inference via Deep Neural Network

The deep neural network is a state-of-the-art method in modern science and technology. Much statistical literature has been devoted to understanding its performance in nonparametric estimation, but the results are suboptimal due to a redundant…

Machine Learning · Computer Science 2021-08-18 Ruiqi Liu, Ben Boukai, Zuofeng Shang

Measurement error models: from nonparametric methods to deep neural networks

The success of deep learning has inspired recent interest in applying neural networks in statistical inference. In this paper, we investigate the use of deep neural networks for nonparametric regression with measurement errors. We propose…

Machine Learning · Statistics 2020-07-16 Zhirui Hu, Zheng Tracy Ke, Jun S Liu

A supervised deep learning method for nonparametric density estimation

Nonparametric density estimation is an unsupervised learning problem. In this work we propose a two-step procedure that casts the density estimation problem in the first step into a supervised regression problem. The advantage is that we…

Statistics Theory · Mathematics 2024-06-04 Thijs Bos, Johannes Schmidt-Hieber

Convergence of Gradient Descent for Recurrent Neural Networks: A Nonasymptotic Analysis

We analyze recurrent neural networks with diagonal hidden-to-hidden weight matrices, trained with gradient descent in the supervised learning setting, and prove that gradient descent can achieve optimality without massive…

Machine Learning · Computer Science 2024-10-11 Semih Cayci, Atilla Eryilmaz

Gradient Descent Finds Over-Parameterized Neural Networks with Sharp Generalization for Nonparametric Regression

We study nonparametric regression by an over-parameterized two-layer neural network trained by gradient descent (GD) in this paper. We show that, if the neural network is trained by GD with early stopping, then the trained network renders a…

Machine Learning · Statistics 2025-11-07 Yingzhen Yang, Ping Li

Robust Nonparametric Regression with Deep Neural Networks

In this paper, we study the properties of robust nonparametric estimation using deep neural networks for regression models with heavy tailed error distributions. We establish the non-asymptotic error bounds for a class of robust…

Statistics Theory · Mathematics 2021-07-23 Guohao Shen, Yuling Jiao, Yuanyuan Lin, Jian Huang

On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent

Estimation of a multivariate regression function from independent and identically distributed data is considered. An estimate is defined which fits a deep neural network consisting of a large number of fully connected neural networks, which…

Statistics Theory · Mathematics 2022-08-31 Selina Drews, Michael Kohler

On the rate of convergence of a neural network regression estimate learned by gradient descent

Nonparametric regression with random design is considered. Estimates are defined by minimizing a penalized empirical $L_2$ risk over a suitably chosen class of neural networks with one hidden layer via gradient descent. Here, the gradient…

Statistics Theory · Mathematics 2019-12-10 Alina Braun, Michael Kohler, Harro Walk

Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few indirect measurements generated via a known acquisition procedure. In particular, neural networks perform well…

Machine Learning · Computer Science 2025-12-05 Hannah Laus, Suzanna Parkinson, Vasileios Charisopoulos, Felix Krahmer, Rebecca Willett

Stochastic Gradient Descent for Nonparametric Additive Regression

This paper introduces an iterative algorithm for training nonparametric additive models that enjoys favorable memory storage and computational requirements. The algorithm can be viewed as the functional counterpart of stochastic gradient…

Machine Learning · Statistics 2026-01-01 Xin Chen, Jason M. Klusowski

An Improved Analysis of Training Over-parameterized Deep Neural Networks

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks. However, the…

Machine Learning · Computer Science 2019-06-12 Difan Zou, Quanquan Gu

Statistical learning by sparse deep neural networks

We consider a deep neural network estimator based on empirical risk minimization with $l_1$-regularization. We derive a general bound for its excess risk in regression and classification (including multiclass), and prove that it is adaptively…

Statistics Theory · Mathematics 2023-11-16 Felix Abramovich

Theory of Deep Learning III: explaining the non-overfitting puzzle

A main puzzle of deep networks revolves around the absence of overfitting despite large overparametrization and despite the large capacity demonstrated by zero training error on randomly labeled data. In this note, we show that the dynamics…

Machine Learning · Computer Science 2018-01-17 Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar

Analysis of the rate of convergence of neural network regression estimates which are easy to implement

Recent results in nonparametric regression show that for deep learning, i.e., for neural network estimates with many hidden layers, we are able to achieve good rates of convergence even in case of high-dimensional predictor variables,…

Statistics Theory · Mathematics 2019-12-12 Alina Braun, Michael Kohler, Adam Krzyzak

Analysis of the expected $L_2$ error of an over-parametrized deep neural network estimate learned by gradient descent without regularization

Recent results show that estimates defined by over-parametrized deep neural networks learned by applying gradient descent to a regularized empirical $L_2$ risk are universally consistent and achieve good rates of convergence. In this paper,…

Machine Learning · Statistics 2023-11-27 Selina Drews, Michael Kohler

Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation

In spite of the accomplishments of deep learning based algorithms in numerous applications and very broad corresponding research interest, at the moment there is still no rigorous understanding of the reasons why such algorithms produce…

Statistics Theory · Mathematics 2020-03-04 Arnulf Jentzen, Timo Welti

Structural Sieves

This paper explores the use of deep neural networks for semiparametric estimation of economic models of maximizing behavior in production or discrete choice. We argue that certain deep networks are particularly well suited as a…

Econometrics · Economics 2022-04-06 Konrad Menzel

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li