Related papers: Speed Limits for Deep Learning

200 papers

Understanding the training dynamics of deep neural networks remains a major open problem, with physics-inspired approaches offering promising insights. Building on this perspective, we develop a thermodynamic framework to describe the…

Machine Learning · Computer Science 2026-05-15 Ildus Sadrtdinov, Ekaterina Lobacheva, Ivan Klimov, Mikhail Burtsev, Mikhail I. Katsnelson, Dmitry Vetrov

Neural Tangent Kernel (NTK) is widely used to analyze overparametrized neural networks due to the famous result by Jacot et al. (2018): in the infinite-width limit, the NTK is deterministic and constant during training. However, this result…

Machine Learning · Computer Science 2022-07-22 Mariia Seleznova, Gitta Kutyniok

Convolutional neural networks (CNNs) have been increasingly deployed to edge devices. Hence, many efforts have been made towards efficient CNN inference in resource-constrained platforms. This paper attempts to explore an orthogonal…

Machine Learning · Computer Science 2019-12-09 Yue Wang, Ziyu Jiang, Xiaohan Chen, Pengfei Xu, Yang Zhao, Yingyan Lin, Zhangyang Wang

Small generalization errors of over-parameterized neural networks (NNs) can be partially explained by the frequency biasing phenomenon, where gradient-based algorithms minimize the low-frequency misfit before reducing the high-frequency…

Machine Learning · Computer Science 2022-09-27 Annan Yu, Yunan Yang, Alex Townsend

The rapid growth of machine learning has spurred legislative initiatives such as "the Right to be Forgotten," allowing users to request data removal. In response, "machine unlearning" proposes the selective removal of unwanted data…

Machine Learning · Computer Science 2023-12-25 Guihong Li, Hsiang Hsu, Chun-Fu Chen, Radu Marculescu

Deep neural networks have dramatically transformed machine learning, but their memory and energy demands are substantial. The requirements of real biological neural networks are rather modest in comparison, and one feature that might…

Machine Learning · Computer Science 2020-07-27 Tianlin Liu, Friedemann Zenke

Physics-informed neural networks (PINNs) have lately received great attention thanks to their flexibility in tackling a wide range of forward and inverse problems involving partial differential equations. However, despite their noticeable…

Machine Learning · Computer Science 2020-07-30 Sifan Wang, Xinling Yu, Paris Perdikaris

We analyze the convergence of the averaged stochastic gradient descent for overparameterized two-layer neural networks for regression problems. It was recently found that a neural tangent kernel (NTK) plays an important role in showing the…

Machine Learning · Statistics 2021-06-14 Atsushi Nitanda, Taiji Suzuki

It is known that the learning rate is the most important hyper-parameter to tune for training deep neural networks. This paper describes a new method for setting the learning rate, named cyclical learning rates, which practically eliminates…

Computer Vision and Pattern Recognition · Computer Science 2017-04-05 Leslie N. Smith

The "Neural Tangent Kernel" (NTK) (Jacot et al. 2018) and its empirical variants have been proposed as a proxy to capture certain behaviors of real neural networks. In this work, we study NTKs through the lens of scaling laws, and…

Machine Learning · Computer Science 2022-06-22 Nikhil Vyas, Yamini Bansal, Preetum Nakkiran

The Neural Tangent Kernel (NTK) is an important milestone in the ongoing effort to build a theory for deep learning. Its prediction that sufficiently wide neural networks behave as kernel methods, or equivalently as random feature models,…

Machine Learning · Computer Science 2020-06-25 Maxim Samarin, Volker Roth, David Belius

The Neural Tangent Kernel (NTK), defined as $\Theta_\theta^f(x_1, x_2) = \left[\partial f(\theta, x_1)\big/\partial \theta\right] \left[\partial f(\theta, x_2)\big/\partial \theta\right]^T$ where $\left[\partial f(\theta,…

Machine Learning · Computer Science 2022-06-20 Roman Novak, Jascha Sohl-Dickstein, Samuel S. Schoenholz

In the last few years, spiking neural networks have been demonstrated to perform on par with regular convolutional neural networks. Several works have proposed methods to convert a pre-trained CNN to a Spiking CNN without a significant…

Neural and Evolutionary Computing · Computer Science 2020-05-05 Martino Sorbaro, Qian Liu, Massimo Bortone, Sadique Sheik

Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems. Modern CNN-based algorithms obtain state-of-the-art performance in diverse image restoration problems. Furthermore, it has been…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Julián Tachella, Junqi Tang, Mike Davies

The Neural Tangent Kernel (NTK) viewpoint is widely employed to analyze the training dynamics of overparameterized Physics-Informed Neural Networks (PINNs). However, unlike the case of linear Partial Differential Equations (PDEs), we show…

Machine Learning · Computer Science 2024-11-01 Andrea Bonfanti, Giuseppe Bruno, Cristina Cipriani

We employ constraints to control the parameter space of deep neural networks throughout training. The use of customized, appropriately designed constraints can reduce the vanishing/exploding gradients problem, improve smoothness of…

Machine Learning · Computer Science 2021-06-22 Benedict Leimkuhler, Tiffany Vlaar, Timothée Pouchon, Amos Storkey

Recent research shows that for training with $\ell_2$ loss, convolutional neural networks (CNNs) whose width (number of channels in convolutional layers) goes to infinity correspond to regression with respect to the CNN Gaussian Process…

Machine Learning · Computer Science 2019-11-05 Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

The study of Neural Tangent Kernels (NTKs) has provided much needed insight into convergence and generalization properties of neural networks in the over-parametrized (wide) limit by approximating the network using a first-order Taylor…

Machine Learning · Statistics 2023-02-02 Alistair Shilton, Sunil Gupta, Santu Rana, Svetha Venkatesh

The evolution of a deep neural network trained by the gradient descent can be described by its neural tangent kernel (NTK) as introduced in [20], where it was proven that in the infinite width limit the NTK converges to an explicit limiting…

Machine Learning · Computer Science 2019-09-19 Jiaoyang Huang, Horng-Tzer Yau

Despite the notable success of deep neural networks (DNNs) in solving complex tasks, the training process still poses considerable challenges. A primary obstacle is the substantial time required for training, particularly as high…

Machine Learning · Computer Science 2025-09-09 Viet Hoang Pham, Hyo-Sung Ahn