Related papers: Adaptive Clustering Using Kernel Density Estimator…
The clusters of a distribution are often defined by the connected components of a density level set. However, this definition depends on the user-specified level. We address this issue by proposing a simple, generic algorithm, which uses an…
A novel nonparametric clustering algorithm is proposed using the interpoint distances between the members of the data to reveal the inherent clustering structure existing in the given set of data, where we apply the classical nonparametric…
Despite the ubiquity of kernel-based clustering, surprisingly few statistical guarantees exist beyond settings that consider strong structural assumptions on the data generation process. In this work, we take a step towards bridging this…
We explore the performance of several automatic bandwidth selectors, originally designed for density gradient estimation, as data-based procedures for nonparametric, modal clustering. The key tool to obtain a clustering from density…
We analyze the clustering problem through a flexible probabilistic model that aims to identify an optimal partition on the sample X 1 , ..., X n. We perform exact clustering with high probability using a convex semidefinite estimator that…
An algorithm to improve performance parameter for unsupervised decision forest clustering and density estimation is presented. Specifically, a dual assignment parameter is introduced as a density estimator by combining Random Forest and…
In this paper we are going to introduce a new nearest neighbours based approach to clustering, and compare it with previous solutions; the resulting algorithm, which takes inspiration from both DBscan and minimum spanning tree approaches,…
This article presents an adaptive mean shift algorithm designed for datasets with varying local scale and cluster cardinality. Local distance distributions, from a point to all others, are used to estimate the cardinality of the local…
A natural way to characterize the cluster structure of a dataset is by finding regions containing a high density of data. This can be done in a nonparametric way with a kernel density estimate, whose modes and hence clusters can be found…
We study generalized density-based clustering in which sharply defined clusters such as clusters on lower-dimensional manifolds are allowed. We show that accurate clustering is possible even in high dimensions. We propose two data-based…
Standard agglomerative clustering suggests establishing a new reliable linkage at every step. However, in order to provide adaptive, density-consistent and flexible solutions, we study extracting all the reliable linkages at each step,…
We present a clustering method and provide a theoretical analysis and an explanation to a phenomenon encountered in the applied statistical literature since the 1990's. This phenomenon is the natural adaptability of the order when using a…
This paper presents new methodology for computationally efficient kernel density estimation. It is shown that a large class of kernels allows for exact evaluation of the density estimates using simple recursions. The same methodology can be…
Kernel density estimation is a popular method for estimating unseen probability distributions. However, the convergence of these classical estimators to the true density slows down in high dimensions. Moreover, they do not define meaningful…
We present a new adaptive kernel density estimator based on linear diffusion processes. The proposed estimator builds on existing ideas for adaptive smoothing by incorporating information from a pilot density estimate. In addition, we…
Clustering algorithms are fundamental tools across many fields, with density-based methods offering particular advantages in identifying arbitrarily shaped clusters and handling noise. However, their effectiveness is often limited by the…
A new clustering accuracy measure is proposed to determine the unknown number of clusters and to assess the quality of clustering of a data set given in any dimensional space. Our validity index applies the classical nonparametric…
Density estimation is a fundamental task in statistics and machine learning applications. Kernel density estimation is a powerful tool for non-parametric density estimation in low dimensions; however, its performance is poor in higher…
This paper deals with nonparametric estimation of conditional den-sities in mixture models in the case when additional covariates are available. The proposed approach consists of performing a prelim-inary clustering algorithm on the…
Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…