Related papers: Nonparametric Density Estimation for High-Dimensio…
One of the fundamental problems in machine learning is the estimation of a probability distribution from data. Many techniques have been proposed to study the structure of data, most often building around the assumption that observations…
Important information concerning a multivariate data set, such as clusters and modal regions, is contained in the derivatives of the probability density function. Despite this importance, nonparametric estimation of higher order derivatives…
Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach providing this description in the form of a topography of…
A novel nonparametric clustering algorithm is proposed using the interpoint distances between the members of the data to reveal the inherent clustering structure existing in the given set of data, where we apply the classical nonparametric…
It is now practically the norm for data to be very high dimensional in areas such as genetics, machine vision, image analysis and many others. When analyzing such data, parametric models are often too inflexible while nonparametric…
The ratio between two probability density functions is an important component of various tasks, including selection bias correction, novelty detection and classification. Recently, several estimators of this ratio have been proposed. Most…
Statistical inference in high dimensional settings has recently attracted enormous attention within the literature. However, most published work focuses on the parametric linear regression problem. This paper considers an important…
Density estimation plays a crucial role in many data analysis tasks, as it infers a continuous probability density function (PDF) from discrete samples. Thus, it is used in tasks as diverse as analyzing population data, spatial locations in…
Density estimation plays a key role in many tasks in machine learning, statistical inference, and visualization. The main bottleneck in high-dimensional density estimation is the prohibitive computational cost and the slow convergence rate.…
Parametric density estimation, for example as Gaussian distribution, is the base of the field of statistics. Machine learning requires inexpensive estimation of much more complex densities, and the basic approach is relatively costly…
We propose a novel and computationally efficient approach for nonparametric conditional density estimation in high-dimensional settings that achieves dimension reduction without imposing restrictive distributional or functional form…
This paper studies the problems of identifiability and estimation in high-dimensional nonparametric latent structure models. We introduce an identifiability theorem that generalizes existing conditions, establishing a unified framework…
The nonparametric formulation of density-based clustering, known as modal clustering, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation…
Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. The existing methods usually consider the case when each instance has a fixed,…
Density-based clustering relies on the idea of linking groups to some specific features of the probability distribution underlying the data. The reference to a true, yet unknown, population structure allows to frame the clustering problem…
High dimensional data analysis is known to be as a challenging problem. In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It…
Nonparametric density estimators are studied for $d$-dimensional, strongly spatial mixing data which is defined on a general $N$-dimensional lattice structure. We consider linear and nonlinear hard thresholded wavelet estimators which are…
Density based spatial clustering of points in $\mathbb{R}^n$ has a myriad of applications in a variety of industries. We generalise this problem to the density based clustering of lines in high-dimensional spaces, keeping in mind there…
This invited paper proposes and discusses several Bayesian attempts at nonparametric and semiparametric density estimation. The main categories of these ideas are as follows: 1) Build a nonparametric prior around a given parametric model.…
We present a nonparametric method for selecting informative features in high-dimensional clustering problems. We start with a screening step that uses a test for multimodality. Then we apply kernel density estimation and mode clustering to…