Related papers: Cluster Forests

Density-based Clustering with Best-scored Random Forest

Single-level density-based approach has long been widely acknowledged to be a conceptually and mathematically convincing clustering method. In this paper, we propose an algorithm called "best-scored clustering forest" that can obtain the…

Machine Learning · Statistics 2019-06-25 Hanyuan Hang , Yuchao Cai , Hanfang Yang

Unsupervised Decision Forest for Data Clustering and Density Estimation

An algorithm to improve performance parameter for unsupervised decision forest clustering and density estimation is presented. Specifically, a dual assignment parameter is introduced as a density estimator by combining Random Forest and…

Computer Vision and Pattern Recognition · Computer Science 2015-07-19 Hayder Albehadili , Naz Islam

Using Gaussian Measures for Efficient Constraint Based Clustering

In this paper we present a novel iterative multiphase clustering technique for efficiently clustering high dimensional data points. For this purpose we implement clustering feature (CF) tree on a real data set and a Gaussian density…

Machine Learning · Computer Science 2014-11-13 Chandrima Sarkar , Atanu Roy

Clustering by the way of atomic fission

Cluster analysis which focuses on the grouping and categorization of similar elements is widely used in various fields of research. Inspired by the phenomenon of atomic fission, a novel density-based clustering algorithm is proposed in this…

Machine Learning · Computer Science 2020-04-28 Shizhan Lu

Cluster-Based Random Forest Visualization and Interpretation

Random forests are a machine learning method used to automatically classify datasets and consist of a multitude of decision trees. While these random forests often have higher performance and generalize better than a single decision tree,…

Machine Learning · Computer Science 2025-07-31 Max Sondag , Christofer Meinecke , Dennis Collaris , Tatiana von Landesberger , Stef van den Elzen

Clustering High-dimensional Data via Feature Selection

High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called…

Methodology · Statistics 2022-10-31 Tianqi Liu , Yu Lu , Biqing Zhu , Hongyu Zhao

Consensus clustering in complex networks

The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…

Physics and Society · Physics 2012-03-29 Andrea Lancichinetti , Santo Fortunato

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai , Saurav Bhandari , Girija Bhusal , Saroj Shakya , Tapendra Pandey

Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift

We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests…

Methodology · Statistics 2026-01-26 Elliot H. Young , Peter Bühlmann

BETULA: Numerically Stable CF-Trees for BIRCH Clustering

BIRCH clustering is a widely known approach for clustering, that has influenced much subsequent research and commercial products. The key contribution of BIRCH is the Clustering Feature tree (CF-Tree), which is a compressed representation…

Machine Learning · Computer Science 2020-11-30 Andreas Lang , Erich Schubert

A Unified Framework for Fair Spectral Clustering With Effective Graph Learning

We consider the problem of spectral clustering under group fairness constraints, where samples from each sensitive group are approximately proportionally represented in each cluster. Traditional fair spectral clustering (FSC) methods…

Machine Learning · Computer Science 2023-11-27 Xiang Zhang , Qiao Wang

Selective clustering ensemble based on kappa and F-score

Clustering ensemble has an impressive performance in improving the accuracy and robustness of partition results and has received much attention in recent years. Selective clustering ensemble (SCE) can further improve the ensemble…

Machine Learning · Computer Science 2022-04-26 Jie Yan , Xin Liu , Ji Qi , Tao You , Zhong-Yuan Zhang

Novel Framework for Spectral Clustering using Topological Node Features(TNF)

Spectral clustering has gained importance in recent years due to its ability to cluster complex data as it requires only pairwise similarity among data points with its ease of implementation. The central point in spectral clustering is the…

Computer Vision and Pattern Recognition · Computer Science 2017-04-11 Lalith Srikanth Chintalapati , Raghunatha Sarma Rachakonda

Explainable $k$-Means and $k$-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science 2020-09-23 Sanjoy Dasgupta , Nave Frost , Michal Moshkovitz , Cyrus Rashtchian

Explainable cluster analysis: a bagging approach

A major limitation of clustering approaches is their lack of explainability: methods rarely provide insight into which features drive the grouping of similar observations. To address this limitation, we propose an ensemble-based clustering…

Machine Learning · Statistics 2026-03-23 Federico Maria Quetti , Elena Ballante , Silvia Figini , Paolo Giudici

Tree Index: A New Cluster Evaluation Technique

We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation…

Machine Learning · Computer Science 2020-03-25 A. H. Beg , Md Zahidul Islam , Vladimir Estivill-Castro

Forest-Guided Clustering -- Shedding Light into the Random Forest Black Box

As machine learning models are increasingly deployed in sensitive application areas, the demand for interpretable and trustworthy decision-making has increased. Random Forests (RF), despite their widespread use and strong performance on…

Machine Learning · Computer Science 2025-07-28 Lisa Barros de Andrade e Sousa , Gregor Miller , Ronan Le Gleut , Dominik Thalmeier , Helena Pelin , Marie Piraud

An expressive dissimilarity measure for relational clustering using neighbourhood trees

Clustering is an underspecified task: there are no universal criteria for what makes a good clustering. This is especially true for relational data, where similarity can be based on the features of individuals, the relationships between…

Machine Learning · Statistics 2017-09-29 Sebastijan Dumancic , Hendrik Blockeel

On Extreme Pruning of Random Forest Ensembles for Real-time Predictive Applications

Random Forest (RF) is an ensemble supervised machine learning technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe…

Machine Learning · Computer Science 2015-03-18 Khaled Fawagreh , Mohamad Medhat Gaber , Eyad Elyan

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains. Inspired by the success of ensemble methods and the flexibility of trees, we propose to learn a similarity…

Machine Learning · Computer Science 2019-08-29 Donghui Yan , Songxiang Gu , Ying Xu , Zhiwei Qin