Related papers: Kernel Methods and Multi-layer Perceptrons Learn L…
Modern deep learning models employ considerably more parameters than required to fit the training data. Whereas conventional statistical wisdom suggests such models should drastically overfit, in practice these models generalize remarkably…
The double descent phenomenon challenges traditional statistical learning theory by revealing scenarios where larger models do not necessarily lead to reduced performance on unseen data. While this counterintuitive behavior has been…
The computational complexity of kernel methods has often been a major barrier for applying them to large-scale learning problems. We argue that this barrier can be effectively overcome. In particular, we develop methods to scale up kernel…
A recent series of theoretical works showed that the dynamics of neural networks with a certain initialisation are well-captured by kernel methods. Concurrent empirical work demonstrated that kernel methods can come close to the performance…
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that…
Kernel methods are powerful machine learning techniques which implement generic non-linear functions to solve complex tasks in a simple way. They Have a solid mathematical background and exhibit excellent performance in practice. However,…
Linear mixed models (LMMs) are used extensively to model dependecies of observations in linear regression and are used extensively in many application areas. Parameter estimation for LMMs can be computationally prohibitive on big data.…
Understanding the dynamics of neural networks in different width regimes is crucial for improving their training and performance. We present an exact solution for the learning dynamics of a one-hidden-layer linear network, with…
In this paper, the flexibility, versatility and predictive power of kernel regression are combined with now lavishly available network data to create regression models with even greater predictive performances. Building from previous work…
Deep learning methods have predominantly been applied to large artificial neural networks. Despite their state-of-the-art performance, these large networks typically do not generalize well to datasets with limited sample sizes. In this…
Metric and kernel learning are important in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are…
Deep learning models, such as wide neural networks, can be conceptualized as nonlinear dynamical physical systems characterized by a multitude of interacting degrees of freedom. Such systems in the infinite limit, tend to exhibit simplified…
Understanding how convolutional neural networks (CNNs) can efficiently learn high-dimensional functions remains a fundamental challenge. A popular belief is that these models harness the local and hierarchical structure of natural data such…
Research in modern data-driven dynamical systems is typically focused on the three key challenges of high dimensionality, unknown dynamics, and nonlinearity. The dynamic mode decomposition (DMD) has emerged as a cornerstone for modeling…
Significant theoretical work has established that in specific regimes, neural networks trained by gradient descent behave like kernel methods. However, in practice, it is known that neural networks strongly outperform their associated…
The use of kernels for nonlinear prediction is widespread in machine learning. They have been popularized in support vector machines and used in kernel ridge regression, amongst others. Kernel methods share three aspects. First, instead of…
We introduce a data-driven order reduction method for nonlinear control systems, drawing on recent progress in machine learning and statistical dimensionality reduction. The method rests on the assumption that the nonlinear system behaves…
A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general…
In this paper, we study the feature learning ability of two-layer neural networks in the mean-field regime through the lens of kernel methods. To focus on the dynamics of the kernel induced by the first layer, we utilize a two-timescale…
This paper studies kernel ridge regression in high dimensions under covariate shifts and analyzes the role of importance re-weighting. We first derive the asymptotic expansion of high dimensional kernels under covariate shifts. By a…