Related papers: Speed Limits for Deep Learning

200 papers

Catastrophic forgetting is one of the fundamental issues of continual learning because neural networks forget the tasks learned previously when trained on new tasks. The proposed framework is a new path-coordinated framework of continual…

Machine Learning · Computer Science 2025-11-05 Rathin Chandra Shit

The Neural Tangent Kernel (NTK) framework explains optimization in over-parameterized neural networks via approximately linearized dynamics, yielding exponential convergence guarantees. However, existing results are often overly pessimistic…

Machine Learning · Computer Science 2026-05-26 Ruchirinkil Marreddy, Chaoyue Liu

Spiking Neural Networks (SNNs) offer a promising energy-efficient alternative to Artificial Neural Networks (ANNs) by utilizing sparse and asynchronous processing through discrete spike-based computation. However, the performance of deep…

Neural and Evolutionary Computing · Computer Science 2025-10-10 Eric Jahns, Davi Moreno, Michel A. Kinsy

Spiking neural networks (SNN) are able to learn spatiotemporal features while using less energy, especially on neuromorphic hardware. The most widely used spiking neuron in deep learning is the Leaky Integrate and Fire (LIF) neuron. LIF…

Neural and Evolutionary Computing · Computer Science 2023-08-08 Sidi Yaya Arnaud Yarga, Sean U. N. Wood

Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the…

Machine Learning · Computer Science 2021-05-27 Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Spiking neural network (SNN) is interesting both theoretically and practically because of its strong bio-inspiration nature and potentially outstanding energy efficiency. Unfortunately, its development has fallen far behind the conventional…

Computer Vision and Pattern Recognition · Computer Science 2021-09-20 Shibo Zhou, Xiaohua LI, Ying Chen, Sanjeev T. Chandrasekaran, Arindam Sanyal

Minimizing non-convex and high-dimensional objective functions is challenging, especially when training modern deep neural networks. In this paper, a novel approach is proposed which divides the training process into two consecutive phases…

Machine Learning · Computer Science 2017-10-11 Nanyang Ye, Zhanxing Zhu, Rafal K. Mantiuk

Recent research has focused on two different approaches to studying neural network training in the limit of infinite width: (1) a mean-field (MF) approximation and (2) a constant neural tangent kernel (NTK) approximation. These two approaches have…

Machine Learning · Computer Science 2020-10-23 Eugene A. Golikov

Conditional computation for Deep Neural Networks (DNNs) reduces overall computational load and improves model accuracy by running a subset of the network. In this work, we present a runtime throttleable neural network (TNN) that can…

Machine Learning · Computer Science 2020-11-06 Hengyue Liu, Samyak Parajuli, Jesse Hostetler, Sek Chai, Bir Bhanu

Deep learning models are yielding increasingly better performance thanks to multiple factors. To be successful, a model may have a large number of parameters or a complex architecture and be trained on a large dataset. This leads to large…

Machine Learning · Computer Science 2022-12-20 Jean-Roch Vlimant, Junqi Yin

Understanding how deep neural networks learn remains a fundamental challenge in modern machine learning. A growing body of evidence suggests that training dynamics undergo a distinct phase transition, yet our understanding of this…

Machine Learning · Computer Science 2025-05-21 Zhanpeng Zhou, Yongyi Yang, Mahito Sugiyama, Junchi Yan

Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep…

Neural and Evolutionary Computing · Computer Science 2014-02-20 Andrew M. Saxe, James L. McClelland, Surya Ganguli

Deep Neural Networks (DNNs) which are trained end-to-end have been successfully applied to solve complex problems that we have not been able to solve in past decades. Autonomous driving is one of the most complex problems which is yet to be…

A rising trend in theoretical deep learning is to understand why deep learning works through Neural Tangent Kernel (NTK) [jgh18], a kernel method that is equivalent to using gradient descent to train a multi-layer infinitely-wide neural…

Machine Learning · Computer Science 2023-09-15 Lianke Qin, Zhao Song, Baocheng Sun

For certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization, but for the networks used in practice, the empirical NTK only provides a rough first-order approximation. Still, a…

Machine Learning · Computer Science 2021-10-14 Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard

In this paper, we describe a phenomenon, which we named "super-convergence", in which neural networks can be trained an order of magnitude faster than with standard training methods. The existence of super-convergence is relevant to…

Machine Learning · Computer Science 2018-05-18 Leslie N. Smith, Nicholay Topin

Deep neural networks (NNs) encounter scalability limitations when confronted with a vast array of neurons, thereby constraining their achievable network depth. To address this challenge, we propose an integration of tensor networks (TN)…

Disordered Systems and Neural Networks · Physics 2024-08-20 Saeed S. Jahromi, Roman Orus

Deep Neural Networks (DNNs) have gained immense success in cognitive applications and greatly pushed today's artificial intelligence forward. The biggest challenge in executing DNNs is their extremely data-extensive computations. The…

Computer Vision and Pattern Recognition · Computer Science 2019-09-10 Fuqiang Liu, C. Liu

A recent breakthrough in deep learning theory shows that the training of over-parameterized deep neural networks can be characterized by a kernel function called neural tangent kernel (NTK). However, it is known that this type of…

Machine Learning · Computer Science 2020-10-07 Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang

Spiking Neural Networks (SNNs) have emerged as a promising alternative to traditional Deep Neural Networks for low-power computing. However, the effectiveness of SNNs is not solely determined by their performance but also by their energy…

Neural and Evolutionary Computing · Computer Science 2023-05-19 Florian Bacho, Dominique Chu