Related papers: Variable Bandwidth Diffusion Kernels
Kernel estimation techniques, such as mean shift, suffer from one major drawback: the kernel bandwidth selection. The bandwidth can be fixed for all the data set or can vary at each points. Automatic bandwidth selection becomes a real…
Selecting an appropriate kernel is a central challenge in kernel-based spectral methods. In \emph{Kernelized Diffusion Maps} (KDM), the kernel determines the accuracy of the RKHS estimator of a diffusion-type operator and hence the quality…
Diffusion maps are a commonly used kernel-based method for manifold learning, which can reveal intrinsic structures in data and embed them in low dimensions. However, as with most kernel methods, its implementation requires a heavy…
Kernel Density Estimation is a very popular technique of approximating a density function from samples. The accuracy is generally well-understood and depends, roughly speaking, on the kernel decay and local smoothness of the true density.…
We introduce a theory of local kernels, which generalize the kernels used in the standard diffusion maps construction of nonparametric modeling. We prove that evaluating a local kernel on a data set gives a discrete representation of the…
We estimate on a compact interval densities with isolated irregularities, such as discontinuities or discontinuities in some derivatives. From independent and identically distributed observations we construct a kernel estimator with…
Generalization beyond a training dataset is a main goal of machine learning, but theoretical understanding of generalization remains an open problem for many models. The need for a new theory is exacerbated by recent observations in deep…
Spectral clustering and diffusion maps are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data. The core of these procedures is the approximation of a Laplacian through a…
``Benign overfitting'', the ability of certain algorithms to interpolate noisy training data and yet perform well out-of-sample, has been a topic of considerable recent interest. We show, using a fixed design setup, that an important class…
It is a common practice to evaluate probability density function or matter spatial density function from statistical samples. Kernel density estimation is a frequently used method, but to select an optimal bandwidth of kernel estimation,…
We employ kernel-based approaches that use samples from a probability distribution to approximate a Kolmogorov operator on a manifold. The self-tuning variable-bandwidth kernel method [Berry & Harlim, Appl. Comput. Harmon. Anal.,…
Recently, the theory of diffusion maps was extended to a large class of local kernels with exponential decay which were shown to represent various Riemannian geometries on a data set sampled from a manifold embedded in Euclidean space.…
Two ubiquitous aspects of large-scale data analysis are that the data often have heavy-tailed properties and that diffusion-based or spectral-based methods are often used to identify and extract structure of interest. Perhaps surprisingly,…
Diffusion Maps framework is a kernel based method for manifold learning and data analysis that defines diffusion similarities by imposing a Markovian process on the given dataset. Analysis by this process uncovers the intrinsic geometric…
Variable kernel density estimation allows the approximation of a probability density by the mean of differently stretched and rotated kernels centered at given sampling points $y_n\in\mathbb{R}^d,\ n=1,\dots,N$. Up to now, the choice of the…
We investigate the discrepancy principle for choosing smoothing parameters for kernel density estimation. The method is based on the distance between the empirical and estimated distribution functions. We prove some new positive and…
Variational methods are widely used for approximate posterior inference. However, their use is typically limited to families of distributions that enjoy particular conjugacy properties. To circumvent this limitation, we propose a family of…
In this paper, we extend the diffusion maps algorithm on a family of heat kernels that are either local (having exponential decay) or nonlocal (having polynomial decay), arising in various applications. For example, these kernels have been…
We introduce a novel diffusion-based spectral algorithm to tackle regression analysis on high-dimensional data, particularly data embedded within lower-dimensional manifolds. Traditional spectral algorithms often fall short in such…
Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij}= k_0( \frac{ \| x_i - x_j \|^2} {\sigma^2} )$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how…