Related papers: Variational Weighting for Kernel Density Ratios
This tutorial provides a gentle introduction to kernel density estimation (KDE) and recent advances regarding confidence bands and geometric/topological features. We begin with a discussion of basic properties of KDE: the convergence rate…
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most…
Imbalanced data occurs in a wide range of scenarios. The skewed distribution of the target variable elicits bias in machine learning algorithms. One of the popular methods to combat imbalanced data is to artificially balance the data…
This paper introduces a novel kernel density estimator (KDE) based on the generalised exponential (GE) distribution, designed specifically for positive continuous data. The proposed GE KDE offers a mathematically tractable form that avoids…
In this paper, the authors propose to estimate the density of a targeted population with a weighted kernel density estimator (wKDE) based on a weighted sample. Bandwidth selection for wKDE is discussed. Three mean integrated squared…
We consider bandwidth matrix selection for kernel density estimators (KDEs) of density level sets in $\mathbb{R}^d$, $d \ge 2$. We also consider estimation of highest density regions, which differs from estimating level sets in that one…
This paper studies the use of kernel density estimation (KDE) for linear algebraic tasks involving the kernel matrix of a collection of $n$ data points in $\mathbb R^d$. In particular, we improve upon existing algorithms for computing the…
Several disciplines, like the social sciences, epidemiology, sentiment analysis, or market research, are interested in knowing the distribution of the classes in a population rather than the individual labels of the members thereof.…
Given a set of points $P\subset \mathbb{R}^{d}$ and a kernel $k$, the Kernel Density Estimate at a point $x\in\mathbb{R}^{d}$ is defined as $\mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y)$. We study the problem of designing a data…
Neural networks have been widely used as predictive models to fit data distributions, and they can be implemented by learning from a collection of samples. In many applications, however, the given dataset may contain noisy samples or…
We propose a method for nonparametric density estimation that exhibits robustness to contamination of the training sample. This method achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical…
We present a model for generating probabilistic forecasts by combining kernel density estimation (KDE) and quantile regression techniques, as part of the probabilistic load forecasting track of the Global Energy Forecasting Competition…
Machine learning models are increasingly used to predict material properties and accelerate atomistic simulations, but the reliability of their predictions depends on the representativeness of the training data. We present a scalable,…
While robust parameter estimation has been well studied in parametric density estimation, there has been little investigation into robust density estimation in the nonparametric setting. We present a robust version of the popular kernel…
Kernel density estimation (KDE) stands out as a challenging task in machine learning. The problem is defined in the following way: given a kernel function $f(x,y)$ and a set of points $\{x_1, x_2, \cdots, x_n \} \subset \mathbb{R}^d$, we…
The reconstruction of smooth density fields from scattered data points is a procedure that has multiple applications in a variety of disciplines, including Lagrangian (particle-based) models of solute transport in fluids. In random walk…
Kernel Density Estimation (KDE) is a cornerstone of nonparametric statistics, yet it remains sensitive to bandwidth choice, boundary bias, and computational inefficiency. This study revisits KDE through a principled convolutional framework,…
Imbalanced response variable distribution is a common occurrence in data science. In fields such as fraud detection, medical diagnostics, system intrusion detection and many others where abnormal behavior is rarely observed, the data under…
This paper presents new methodology for computationally efficient kernel density estimation. It is shown that a large class of kernels allows for exact evaluation of the density estimates using simple recursions. The same methodology can be…
Kernel density estimation on a finite interval poses an outstanding challenge because of the well-recognized bias at the boundaries of the interval. Motivated by an application in cancer research, we consider a boundary constraint linking…
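Several of the abstracts above rely on the same basic definition, $\mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y)$. As a rough illustration only, the following sketch evaluates that sum directly. The Gaussian kernel, the bandwidth parameter `h`, and the function names are assumptions made for this example and are not taken from any of the listed papers.

```python
import math


def gaussian_kernel(x, y, h=1.0):
    """Unnormalized Gaussian kernel exp(-||x - y||^2 / (2 h^2)).

    The kernel choice and the bandwidth h are assumptions for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * h * h))


def kde(points, x, kernel=gaussian_kernel):
    """Directly evaluate KDE_P(x) = (1/|P|) * sum over y in P of k(x, y)."""
    if not points:
        raise ValueError("points must be non-empty")
    return sum(kernel(x, y) for y in points) / len(points)


# Example: three points in R^2, queried at the origin.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
value = kde(P, (0.0, 0.0))
```

Each query in this sketch visits every point of $P$, so the cost of a single evaluation grows linearly with $|P|$.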