Related papers: Fully adaptive density-based clustering

200 papers

We derive and analyze a generic, recursive algorithm for estimating all splits in a finite cluster tree as well as the corresponding clusters. We further investigate statistical properties of this generic clustering algorithm when it…

Machine Learning · Statistics 2021-11-02 Ingo Steinwart, Bharath K. Sriperumbudur, Philipp Thomann

The single-level density-based approach has long been widely acknowledged as a conceptually and mathematically convincing clustering method. In this paper, we propose an algorithm called "best-scored clustering forest" that can obtain the…

Machine Learning · Statistics 2019-06-25 Hanyuan Hang, Yuchao Cai, Hanfang Yang

Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…

Databases · Computer Science 2017-04-17 Nhien-An Le-Khac, M-Tahar Kechadi

Clustering algorithms are fundamental tools across many fields, with density-based methods offering particular advantages in identifying arbitrarily shaped clusters and handling noise. However, their effectiveness is often limited by the…

Machine Learning · Computer Science 2025-12-01 Meysam Shirdel Bilehsavar, Razieh Ghaedi, Samira Seyed Taheri, Xinqi Fan, Christian O'Reilly

Density level sets can be estimated using plug-in methods, excess mass algorithms or a hybrid of the two previous methodologies. The plug-in algorithms are based on replacing the unknown density by some nonparametric estimator, usually the…

Statistics Theory · Mathematics 2016-11-26 A. Rodríguez-Casal, P. Saavedra-Nieves

In this paper we introduce a new nearest-neighbours-based approach to clustering and compare it with previous solutions; the resulting algorithm, which takes inspiration from both DBSCAN and minimum spanning tree approaches,…

Data Structures and Algorithms · Computer Science 2014-07-14 Marcello La Rocca

After generalizing the concept of clusters to incorporate clusters that are linked to other clusters through some relatively narrow bridges, an approach for detecting patches of separation between these clusters is developed based on an…

Computer Vision and Pattern Recognition · Computer Science 2020-01-10 Luciano da F. Costa

Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex…

Databases · Computer Science 2014-09-24 Eshref Januzaj, Hans-Peter Kriegel, Martin Pfeifle

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based…

Machine Learning · Computer Science 2025-08-06 Ninh Pham, Yingtao Zheng, Hugo Phibbs

Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes…

Machine Learning · Computer Science 2012-07-03 Konstantina Palla, David Knowles, Zoubin Ghahramani

The determination of cluster centers generally depends on the scale that we use to analyze the data to be clustered. An inappropriate scale usually leads to unreasonable cluster centers and thus unreasonable results. In this study, we first…

Machine Learning · Statistics 2016-10-20 Xiurui Geng, Hairong Tang

Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…

Methodology · Statistics 2025-12-12 David Buch, Miheer Dewaskar, David B. Dunson

We study generalized density-based clustering in which sharply defined clusters such as clusters on lower-dimensional manifolds are allowed. We show that accurate clustering is possible even in high dimensions. We propose two data-based…

Statistics Theory · Mathematics 2010-11-11 Alessandro Rinaldo, Larry Wasserman

We analyze the clustering problem through a flexible probabilistic model that aims to identify an optimal partition on the sample $X_1, \ldots, X_n$. We perform exact clustering with high probability using a convex semidefinite estimator that…

Statistics Theory · Mathematics 2017-05-19 Martin Royer

For a density $f$ on ${\mathbb R}^d$, a {\it high-density cluster} is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the {\it cluster tree} of…

Machine Learning · Statistics 2014-06-09 Kamalika Chaudhuri, Sanjoy Dasgupta, Samory Kpotufe, Ulrike von Luxburg

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao, Jinying Gao, Manman Deng

Mode clustering is a nonparametric method for clustering that defines clusters using the basins of attraction of a density estimator's modes. We provide several enhancements to mode clustering: (i) a soft variant of cluster assignment, (ii)…

Methodology · Statistics 2015-12-23 Yen-Chi Chen, Christopher R. Genovese, Larry Wasserman

A main task in data analysis is to organize data points into coherent groups or clusters. The stochastic block model is a probabilistic model for the cluster structure. This model prescribes different probabilities for the presence of edges…

Machine Learning · Computer Science 2020-09-24 Alexander Jung

We present a clustering method and provide a theoretical analysis and an explanation of a phenomenon encountered in the applied statistical literature since the 1990s. This phenomenon is the natural adaptability of the order when using a…

Statistics Theory · Mathematics 2022-03-23 Thierry Dumont

With the recent growth in data availability and complexity, and the associated outburst of elaborate modelling approaches, model selection tools have become a lifeline, providing objective criteria to deal with this increasingly challenging…

Methodology · Statistics 2020-10-08 Alessandro Casa, Luca Scrucca, Giovanna Menardi