Related papers: Speed Limits for Deep Learning

200 papers

As its width tends to infinity, a deep neural network's behavior under gradient descent can become simplified and predictable (e.g. given by the Neural Tangent Kernel (NTK)), if it is parametrized appropriately (e.g. the NTK…

Machine Learning · Computer Science 2022-07-18 Greg Yang, Edward J. Hu

A quantum neural network (QNN) is a parameterized mapping efficiently implementable on near-term Noisy Intermediate-Scale Quantum (NISQ) computers. It can be used for supervised learning when combined with classical gradient-based…

Quantum Physics · Physics 2023-03-28 Xuchen You, Shouvanik Chakrabarti, Boyang Chen, Xiaodi Wu

Graph Convolutional Networks (GCNs) have emerged as powerful tools for learning on network structured data. Although empirically successful, GCNs exhibit certain behaviour that has no rigorous explanation -- for instance, the performance of…

Machine Learning · Computer Science 2023-11-07 Mahalakshmi Sabanayagam, Pascal Esser, Debarghya Ghoshdastidar

Current Deep Learning approaches have been very successful using convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers. Three limitations of this approach are: 1) they are based on a simple…

Neural and Evolutionary Computing · Computer Science 2017-07-17 Thomas E. Potok, Catherine Schuman, Steven R. Young, Robert M. Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, Gangotree Chakma

We analyze the generalization properties of two-layer neural networks in the neural tangent kernel (NTK) regime, trained with gradient descent (GD). For early stopped GD we derive fast rates of convergence that are known to be minimax…

Machine Learning · Statistics 2023-09-18 Mike Nguyen, Nicole Mücke

Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low frequency fit before high…

Machine Learning · Computer Science 2020-03-11 Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, Shira Kritchman

The training dynamics and generalization properties of neural networks (NN) can be precisely characterized in function space via the neural tangent kernel (NTK). Structural changes to the NTK during training reflect feature learning and…

Machine Learning · Statistics 2022-02-11 Haozhe Shan, Blake Bordelon

A theory of neural networks (NNs) built upon collective variables would provide scientists with the tools to better understand the learning process at every stage. In this work, we introduce two such variables, the entropy and the trace of…

Machine Learning · Computer Science 2023-05-03 Samuel Tovey, Sven Krippendorf, Konstantin Nikolaou, Christian Holm

Scaling quantum computers requires tight integration of cryogenic control electronics with quantum processors, where Digital-to-Analog Converters (DACs) face severe power and area constraints. We investigate quantum neural network (QNN)…

The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key…

Machine Learning · Computer Science 2021-06-16 Sina Alemohammad, Zichao Wang, Randall Balestriero, Richard Baraniuk

Decentralized federated learning (DFL) is a collaborative machine learning framework for training a model across participants without a central server or raw data exchange. DFL faces challenges due to statistical heterogeneity, as…

Machine Learning · Computer Science 2025-06-16 Gabriel Thompson, Kai Yue, Chau-Wai Wong, Huaiyu Dai

This paper studies the infinite-width limit of deep linear neural networks initialized with random parameters. We obtain that, when the number of neurons diverges, the training dynamics converge (in a precise sense) to the dynamics obtained…

Machine Learning · Computer Science 2022-12-01 Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli

Modern neural networks are often regarded as complex black-box functions whose behavior is difficult to understand owing to their nonlinear dependence on the data and the nonconvexity in their loss landscapes. In this work, we show that…

Machine Learning · Computer Science 2020-06-26 Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington

Neural Tangent Kernel (NTK) theory is widely used to study the dynamics of infinitely-wide deep neural networks (DNNs) under gradient descent. But do the results for infinitely-wide networks give us hints about the behavior of real…

Machine Learning · Computer Science 2022-02-02 Mariia Seleznova, Gitta Kutyniok

Knowing whether a Quantum Machine Learning model would perform well on a given dataset before training it can help to save critical resources. However, gathering a priori information about model performance (e.g., training speed, critical…

Quantum Physics · Physics 2025-03-05 Francesco Scala, Christa Zoufal, Dario Gerace, Francesco Tacchino

In wide neural networks, the Neural Tangent Kernel (NTK) remains approximately constant during training, providing a powerful theoretical tool for studying training dynamics, generalization, and connections to kernel methods. However, this…

Machine Learning · Computer Science 2026-05-26 Jonathan Plenk, Sergio Calvo-Ordonez, Alvaro Cartea, Yarin Gal, Mark van der Wilk, Kamil Ciosek

Despite their prevalence in deep-learning communities, over-parameterized models convey high demands of computational costs for proper training. This work studies the fine-grained, modular-level learning dynamics of over-parameterized…

We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data. Convergence at a linear…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu

With the prosperity of mobile devices, the distributed learning approach enabling model training with decentralized data has attracted wide research. However, the lack of training capability for edge devices significantly limits the energy…

Machine Learning · Computer Science 2021-05-14 Ziyang Hong, C. Patrick Yue

Long training times of deep neural networks are a bottleneck in machine learning research. The major impediment to fast training is the quadratic growth of both memory and compute requirements of dense and convolutional layers with respect…

Machine Learning · Computer Science 2020-02-20 Mihailo Isakov, Michel A. Kinsy