Related papers: Gradient Kernel Regression
Deep learning's successes are often attributed to its ability to automatically discover new representations of the data, rather than relying on handcrafted features like other learning methods. We show, however, that deep networks learned…
Deep kernel learning refers to a Gaussian process that incorporates neural networks to improve the modelling of complex functions. We present a method that makes this approach feasible for problems where the data consists of line integral…
Neural tangent kernels (NTKs) have been proposed to study the behavior of trained neural networks from the perspective of Gaussian processes. An important result in this body of work is the theorem of equivalence between a trained neural…
A recent breakthrough in deep learning theory shows that the training of over-parameterized deep neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, it is known that this type of…
We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error…
We propose a new way of training neural networks, with the goal of reducing training cost. Our method uses approximate predicted gradients instead of the full gradients that require an expensive backward pass. We derive a…
The primary hyperparameter in kernel regression (KR) is the choice of kernel. In most theoretical studies of KR, one assumes the kernel is fixed before seeing the training data. Under this assumption, it is known that the optimal kernel is…
State-of-the-art neural networks are heavily over-parameterized, making the optimization algorithm a crucial ingredient for learning predictive models with good generalization properties. A recent line of work has shown that in a certain…
The development of methods to guide the design of neural networks is an important open challenge for deep learning theory. As a paradigm for principled neural architecture design, we propose the translation of high-performing kernels, which…
The Neural Tangent Kernel (NTK) has recently attracted intense study, as it describes the evolution of an over-parameterized Neural Network (NN) trained by gradient descent. However, it is now well-known that gradient descent is not always…
Deep neural networks have become essential for numerous applications, such as vision, RL, and classification, due to their strong empirical performance. Unfortunately, these networks are quite difficult to interpret, and this limits their…
Recent theoretical works based on the neural tangent kernel (NTK) have shed light on the optimization and generalization of over-parameterized networks, and partially bridge the gap between their practical success and classical learning…
The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite-width neural network. Despite the impracticality of this limit, the neural tangent kernel has allowed for a more direct study of neural…
The Neural Tangent Kernel (NTK) is the wide-network limit of a kernel defined using neural networks at initialization, whose embedding is the gradient of the output of the network with respect to its parameters. We study the "after kernel",…
A rising trend in theoretical deep learning is to understand why deep learning works through Neural Tangent Kernel (NTK) [jgh18], a kernel method that is equivalent to using gradient descent to train a multi-layer infinitely-wide neural…
In this article, we introduce a kernel-based consensual aggregation method for regression problems. We aim to flexibly combine individual regression estimators $r_1, r_2, \ldots, r_M$ using a weighted average where the weights are defined…
The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key…
We propose a novel theoretical framework of analysis for Generative Adversarial Networks (GANs). We reveal a fundamental flaw of previous analyses which, by incorrectly modeling GANs' training scheme, are subject to ill-defined…
Past decades have witnessed great interest in the distinction and connection between neural network learning and kernel learning. Recent advancements have made theoretical progress in connecting infinitely wide neural networks and Gaussian…
We explore the equivalence between neural networks and kernel methods by deriving the first exact representation of any finite-size parametric classification model trained with gradient descent as a kernel machine. We compare our exact…