Related papers: Nystr\"om Kernel Mean Embeddings
Kernel mean embeddings are a popular tool that consists in representing probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space. When the kernel is characteristic, mean embeddings can be used…
We analyze the Nystr\"om approximation of a positive definite kernel associated with a probability measure. We first prove an improved error bound for the conventional Nystr\"om approximation with i.i.d. sampling and singular-value…
The kernel mean embedding of probability distributions is commonly used in machine learning as an injective mapping from distributions to functions in an infinite dimensional Hilbert space. It allows us, for example, to define a distance…
Kernel techniques are among the most popular and flexible approaches in data science allowing to represent probability measures without loss of information under mild conditions. The resulting mapping called mean embedding gives rise to a…
Kernel methods are powerful tools in various data analysis tasks. Yet, in many cases, their time and space complexity render them impractical for large datasets. Various kernel approximation methods were proposed to overcome this issue,…
Kernel mean embeddings -- integrals of a kernel with respect to a probability distribution -- are essential in Bayesian quadrature, but also widely used in other computational tools for numerical integration or for statistical inference…
Kernel $k$-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear $k$-means clustering algorithm. However, kernel $k$-means clustering is computationally expensive when the…
Many kernel methods suffer from high time and space complexities and are thus prohibitive in big-data applications. To tackle the computational challenge, the Nystr\"om method has been extensively used to reduce time and space complexities…
The functional linear regression model has been widely studied and utilized for dealing with functional predictors. In this paper, we study the Nystr\"om subsampling method, a strategy used to tackle the computational complexities inherent…
Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization. It is well known that a major downside for kernel-based models is the high…
We investigate the efficiency of k-means in terms of both statistical and computational requirements. More precisely, we study a Nystr\"om approach to kernel k-means. We analyze the statistical properties of the proposed method and show…
Kernel methods provide an elegant framework for developing nonlinear learning algorithms from simple linear methods. Though these methods have superior empirical performance in several real data applications, their usefulness is inhibited…
We study Nystr\"om type subsampling approaches to large scale kernel methods, and prove learning bounds in the statistical learning setting, where random sampling and high probability estimates are considered. In particular, we prove that…
We propose a novel class of kernels to alleviate the high computational cost of large-scale nonparametric learning with kernel methods. The proposed kernel is defined based on a hierarchical partitioning of the underlying data domain, where…
The Nystrom method has been popular for generating the low-rank approximation of kernel matrices that arise in many machine learning problems. The approximation quality of the Nystrom method depends crucially on the number of selected…
The Nystr\"om method is a convenient heuristic method to obtain low-rank approximations to kernel matrices in nearly linear complexity. Existing studies typically use the method to approximate positive semidefinite matrices with low or…
Kernel ridge regression, in general, is expensive in memory allocation and computation time. This paper addresses low rank approximations and surrogates for kernel ridge regression, which bridge these difficulties. The fundamental…
Two-sample hypothesis testing-determining whether two sets of data are drawn from the same distribution-is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…
Kernel methods underpin many of the most successful approaches in data science and statistics, and they allow representing probability measures as elements of a reproducing kernel Hilbert space without loss of information. Recently, the…
A Hilbert space embedding of a distribution---in short, a kernel mean embedding---has recently emerged as a powerful tool for machine learning and inference. The basic idea behind this framework is to map distributions into a reproducing…