Related papers: Automated Spectral Kernel Learning
We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples using theoretical methods from Gaussian processes and statistical physics. Our expressions apply to…
In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, data driven framework for learning the kernel function in Transformers. Our framework approximates the Transformer kernel as a dot product between spectral feature maps…
The asymptotically precise estimation of the generalization of kernel methods has recently received attention due to the parallels between neural networks and their associated kernels. However, prior works derive such estimates for training…
Recently, non-stationary spectral kernels have drawn much attention, owing to its powerful feature representation ability in revealing long-range correlations and input-dependent characteristics. However, non-stationary spectral kernels are…
This paper introduces a new and effective algorithm for learning kernels in a Multi-Task Learning (MTL) setting. Although, we consider a MTL scenario here, our approach can be easily applied to standard single task learning, as well. As…
Learning with kernels is an important concept in machine learning. Standard approaches for kernel methods often use predefined kernels that require careful selection of hyperparameters. To mitigate this burden, we propose in this paper a…
We introduce a novel kernel-based framework for learning differential equations and their solution maps that is efficient in data requirements, in terms of solution examples and amount of measurements from each example, and computational…
Distributed learning is an effective way to analyze big data. In distributed regression, a typical approach is to divide the big data into multiple blocks, apply a base regression algorithm on each of them, and then simply average the…
Conditional kernel mean embeddings are nonparametric models that encode conditional expectations in a reproducing kernel Hilbert space. While they provide a flexible and powerful framework for probabilistic inference, their performance is…
Generalization beyond a training dataset is a main goal of machine learning, but theoretical understanding of generalization remains an open problem for many models. The need for a new theory is exacerbated by recent observations in deep…
The success of kernel-based learning methods depend on the choice of kernel. Recently, kernel learning methods have been proposed that use data to select the most appropriate kernel, usually by combining a set of base kernels. We introduce…
We establish a general form of explicit, input-dependent, measure-valued warpings for learning nonstationary kernels. While stationary kernels are ubiquitous and simple to use, they struggle to adapt to functions that vary in smoothness…
This paper presents an algorithm, Voted Kernel Regularization , that provides the flexibility of using potentially very complex kernel functions such as predictors based on much higher-degree polynomial kernels, while benefitting from…
We propose new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity. The estimates we establish give optimal rates and are based on a local and empirical version of Rademacher averages, in the sense…
Kernel methods have great promise for learning rich statistical representations of large modern datasets. However, compared to neural networks, kernel methods have been perceived as lacking in scalability and flexibility. We introduce a…
Standard kernels such as Mat\'ern or RBF kernels only encode simple monotonic dependencies within the input space. Spectral mixture kernels have been proposed as general-purpose, flexible kernels for learning and discovering more…
Similarity learning has received a large amount of interest and is an important tool for many scientific and industrial applications. In this framework, we wish to infer the distance (similarity) between points with respect to an arbitrary…
Spectral kernel methods are techniques for transforming data into a coordinate system that efficiently reveals the geometric structure - in particular, the "connectivity" - of the data. These methods depend on certain tuning parameters. We…
Kernels are powerful and versatile tools in machine learning and statistics. Although the notion of universal kernels and characteristic kernels has been studied, kernel selection still greatly influences the empirical performance. While…
Kernel-based methods exhibit well-documented performance in various nonlinear learning tasks. Most of them rely on a preselected kernel, whose prudent choice presumes task-specific prior information. Especially when the latter is not…