Related papers: Notes on Deep Learning Theory
A recent breakthrough in deep learning theory shows that the training of over-parameterized deep neural networks can be characterized by a kernel function called \textit{neural tangent kernel} (NTK). However, it is known that this type of…
These notes are based on a lecture delivered by NC on March 2021, as part of an advanced course in Princeton University on the mathematical understanding of deep learning. They present a theory (developed by NC, NR and collaborators) of…
This lecture note is intended to prepare early-year master's and PhD students in data science or a related discipline with foundational ideas in machine learning. It starts with basic ideas in modern machine learning with classification as…
We discuss certain recent metric space methods and some of the possibilities these methods provide, with special focus on various generalizations of Lyapunov exponents originally appearing in the theory of dynamical systems and differential…
In this article, we review the literature on statistical theories of neural networks from three perspectives: approximation, training dynamics and generative models. In the first part, results on excess risks for neural networks are…
These notes were compiled as lecture notes for a course developed and taught at the University of the Southern California. They should be accessible to a typical engineering graduate student with a strong background in Applied Mathematics.…
Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized.…
The tremendous recent progress in analyzing the training dynamics of overparameterized neural networks has primarily focused on wide networks and therefore does not sufficiently address the role of depth in deep learning. In this work, we…
These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the…
Sequential training from task to task is becoming one of the major objects in deep learning applications such as continual learning and transfer learning. Nevertheless, it remains unclear under what conditions the trained model's…
In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics is well-approximated by a linear weight expansion of the network at…
These brief lecture notes cover the basics of neural networks and deep learning as well as their applications in the quantum domain, for physicists without prior knowledge. In the first part, we describe training using backpropagation,…
These lecture notes develop the theory of learning in deep and recurrent neuronal networks from the point of view of Bayesian inference. The aim is to enable the reader to understand typical computations found in the literature in this…
These are the lecture notes that accompanied the course of the same name that I taught at the Eindhoven University of Technology from 2021 to 2023. The course is intended as an introduction to neural networks for mathematics students at the…
We present a non-asymptotic theory of generalization in deep learning where the empirical neural tangent kernel partitions the output space. In directions corresponding to signal, error dissipates rapidly; in the vast orthogonal dimensions…
An in-depth exploration of object detection and semantic segmentation is provided, combining theoretical foundations with practical applications. State-of-the-art advancements in machine learning and deep learning are reviewed, focusing on…
Modern deep learning models employ considerably more parameters than required to fit the training data. Whereas conventional statistical wisdom suggests such models should drastically overfit, in practice these models generalize remarkably…
Among many unsolved puzzles in theories of Deep Neural Networks (DNNs), there are three most fundamental challenges that highly demand solutions, namely, expressibility, optimisability, and generalisability. Although there have been…
In this paper, we explore the application of mean field theory, a technique from statistical physics, to deep metric learning and address the high training complexity commonly associated with conventional metric learning loss functions. By…
The ability of learning useful features is one of the major advantages of neural networks. Although recent works show that neural network can operate in a neural tangent kernel (NTK) regime that does not allow feature learning, many works…