Related papers: Speed Limits for Deep Learning

200 papers

Decentralized optimization with time-varying networks is an emerging paradigm in machine learning. It saves remarkable communication overhead in large-scale deep training and is more robust in wireless scenarios especially when nodes are…

Machine Learning · Computer Science 2022-11-02 Xinmeng Huang, Kun Yuan

Physics-informed Kolmogorov-Arnold Networks (PIKANs), and in particular their Chebyshev-based variants (cPIKANs), have recently emerged as promising models for solving partial differential equations (PDEs). However, their training dynamics…

Machine Learning · Computer Science 2025-06-10 Salah A. Faroughi, Farinaz Mostajeran

A convergence analysis is developed for the regularized Newton method for training neural networks (NNs) in the overparameterized limit. As the number of hidden units tends to infinity, the NN training dynamics converge in probability to…

Machine Learning · Computer Science 2026-05-21 Konstantin Riedl, Konstantinos Spiliopoulos, Justin Sirignano

Spiking neural networks (SNNs) have demonstrated excellent capabilities in various intelligent scenarios. Most existing methods for training SNNs are based on the concept of synaptic plasticity; however, learning in the realistic brain also…

Neural and Evolutionary Computing · Computer Science 2023-04-04 Hongze Sun, Wuque Cai, Baoxin Yang, Yan Cui, Yang Xia, Dezhong Yao, Daqing Guo

Neural networks enjoy widespread success in both research and industry and, with the imminent advent of quantum technology, it is now a crucial challenge to design quantum neural networks for fully quantum learning tasks. Here we propose…

We develop an approach to efficiently grow neural networks, within which parameterization and optimization strategies are designed by considering their effects on the training dynamics. Unlike existing growing methods, which follow simple…

Machine Learning · Computer Science 2023-06-23 Xin Yuan, Pedro Savarese, Michael Maire

Training deep Convolutional Neural Networks (CNN) is a time consuming task that may take weeks to complete. In this article we propose a novel, theoretically founded method for reducing CNN training time without incurring any loss in…

Computer Vision and Pattern Recognition · Computer Science 2016-10-13 Pedro Porto Buarque de Gusmão, Gianluca Francini, Skjalg Lepsøy, Enrico Magli

Transformers have become the dominant architecture in modern machine learning, yet the theoretical understanding of their training dynamics remains limited. This paper develops a rigorous mathematical framework for analyzing gradient-based…

Optimization and Control · Mathematics 2026-05-19 Raphaël Barboni, Maarten V. de Hoop, Takashi Furuya, Gabriel Peyré

In wireless communications, estimation of channels in OFDM systems spans frequency and time, which relies on sparse collections of pilot data, posing an ill-posed inverse problem. Moreover, deep learning estimators require large amounts of…

Machine Learning · Computer Science 2025-04-14 Mohammed Mallik, Guillaume Villemaud

The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of…

Machine Learning · Statistics 2021-04-29 Wesley J. Maddox, Shuai Tang, Pablo Garcia Moreno, Andrew Gordon Wilson, Andreas Damianou

Spiking neural networks have gained significant attention due to their brain-like information processing capabilities. The use of surrogate gradients has made it possible to train spiking neural networks with backpropagation, leading to…

Neural and Evolutionary Computing · Computer Science 2023-05-24 Dongcheng Zhao, Guobin Shen, Yiting Dong, Yang Li, Yi Zeng

Spiking neural networks (SNNs) are promising in a bio-plausible coding for spatio-temporal information and event-driven signal processing, which is very suited for energy-efficient implementation in neuromorphic hardware. However, the…

Neural and Evolutionary Computing · Computer Science 2020-12-21 Hanle Zheng, Yujie Wu, Lei Deng, Yifan Hu, Guoqi Li

Very deep convolutional networks with hundreds of layers have led to significant reductions in error on competitive benchmarks. Although the unmatched expressiveness of the many layers can be highly desirable at test time, training very…

Machine Learning · Computer Science 2016-08-01 Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger

The high computational complexity and increasing parameter counts of deep neural networks pose significant challenges for deployment in resource-constrained environments, such as edge devices or real-time systems. To address this, we…

Machine Learning · Computer Science 2025-06-17 Laura Erb, Tommaso Boccato, Alexandru Vasilache, Juergen Becker, Nicola Toschi

The increasing complexity of deep learning architectures is resulting in training time requiring weeks or even months. This slow training is due in part to vanishing gradients, in which the gradients used by back-propagation are extremely…

Computer Vision and Pattern Recognition · Computer Science 2015-10-16 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Goldstein, Gavin Taylor

Deep spiking neural networks (SNNs) support asynchronous event-driven computation, massive parallelism and demonstrate great potential to improve the energy efficiency of its synchronous analog counterpart. However, insufficient attention…

Neural and Evolutionary Computing · Computer Science 2019-02-18 Jibin Wu, Yansong Chua, Malu Zhang, Qu Yang, Guoqi Li, Haizhou Li

In training neural networks, it is common practice to use partial gradients computed over batches, mostly very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true…

Machine Learning · Computer Science 2024-11-25 Jan Spörer, Bernhard Bermeitinger, Tomas Hrycej, Niklas Limacher, Siegfried Handschuh

Deep neural networks are machine learning tools that are transforming fields ranging from speech recognition to computational medicine. In this study, we extend their application to the field of alloy solidification modeling. To that end,…

Applied Physics · Physics 2019-12-23 M. Torabi Rad, A. Viardin, G. J. Schmitz, M. Apel

Spiking Neural Networks (SNNs) that operate in an event-driven manner and employ binary spike representation have recently emerged as promising candidates for energy-efficient computing. However, a cost bottleneck arises in obtaining…

Neural and Evolutionary Computing · Computer Science 2024-01-22 Yunpeng Yao, Man Wu, Zheng Chen, Renyuan Zhang

The neural tangent kernel (NTK) has garnered significant attention as a theoretical framework for describing the behavior of large-scale neural networks. Kernel methods are theoretically well-understood and as a result enjoy algorithmic…

Machine Learning · Computer Science 2024-05-30 Jonathan Wenger, Felix Dangel, Agustinus Kristiadi