Related papers: Cluster-Based Random Forest Visualization and Inte…

An Unsupervised Random Forest Clustering Technique for Automatic Traffic Scenario Categorization

A modification of the Random Forest algorithm for the categorization of traffic situations is introduced in this paper. The procedure yields an unsupervised machine learning method. The algorithm generates a proximity matrix which contains…

Signal Processing · Electrical Eng. & Systems 2020-04-08 Friedrich Kruber, Jonas Wurst, Michael Botsch

Large Random Forests: Optimisation for Rapid Evaluation

Random Forests are one of the most popular classifiers in machine learning. The larger they are, the more precise is the outcome of their predictions. However, this comes at a cost: their running time for classification grows linearly with…

Machine Learning · Computer Science 2019-12-24 Frederik Gossen, Bernhard Steffen

Unsupervised Decision Forest for Data Clustering and Density Estimation

An algorithm to improve performance parameter for unsupervised decision forest clustering and density estimation is presented. Specifically, a dual assignment parameter is introduced as a density estimator by combining Random Forest and…

Computer Vision and Pattern Recognition · Computer Science 2015-07-19 Hayder Albehadili, Naz Islam

Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift

We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests…

Methodology · Statistics 2026-01-26 Elliot H. Young, Peter Bühlmann

Forest-Guided Clustering -- Shedding Light into the Random Forest Black Box

As machine learning models are increasingly deployed in sensitive application areas, the demand for interpretable and trustworthy decision-making has increased. Random Forests (RF), despite their widespread use and strong performance on…

Machine Learning · Computer Science 2025-07-28 Lisa Barros de Andrade e Sousa, Gregor Miller, Ronan Le Gleut, Dominik Thalmeier, Helena Pelin, Marie Piraud

Visualizing Random Forest with Self-Organising Map

Random Forest (RF) is a powerful ensemble method for classification and regression tasks. It consists of a set of decision trees. Although a single tree is readily interpretable by humans, the ensemble of trees is a black-box model. The popular…

Machine Learning · Computer Science 2014-07-17 Piotr Płoński, Krzysztof Zaremba

Unsupervised and Supervised Learning with the Random Forest Algorithm for Traffic Scenario Clustering and Classification

The goal of this paper is to provide a method, which is able to find categories of traffic scenarios automatically. The architecture consists of three main components: A microscopic traffic simulation, a clustering technique and a…

Signal Processing · Electrical Eng. & Systems 2020-04-08 Friedrich Kruber, Jonas Wurst, Eduardo Sánchez Morales, Samarjit Chakraborty, Michael Botsch

LionForests: Local Interpretation of Random Forests

Towards a future where machine learning systems will integrate into every aspect of people's lives, researching methods to interpret such systems is necessary, instead of focusing exclusively on enhancing their performance. Enriching the…

Machine Learning · Computer Science 2021-12-21 Ioannis Mollas, Nick Bassiliades, Ioannis Vlahavas, Grigorios Tsoumakas

Understanding Random Forests: From Theory to Practice

Data analysis and machine learning have become an integrative part of the modern scientific methodology, offering automated procedures for the prediction of a phenomenon based on past observations, unraveling underlying patterns in data and…

Machine Learning · Statistics 2015-06-04 Gilles Louppe

Cross-Cluster Weighted Forests

Adapting machine learning algorithms to better handle the presence of clusters or batch effects within training datasets is important across a wide variety of biological applications. This article considers the effect of ensembling Random…

Machine Learning · Statistics 2025-04-01 Maya Ramchandran, Rajarshi Mukherjee, Giovanni Parmigiani

Data Selection: A General Principle for Building Small Interpretable Models

We present convincing empirical evidence for an effective and general strategy for building accurate small models. Such models are attractive for interpretability and also find use in resource-constrained environments. The strategy is to…

Machine Learning · Computer Science 2024-04-30 Abhishek Ghose

Tree Index: A New Cluster Evaluation Technique

We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation…

Machine Learning · Computer Science 2020-03-25 A. H. Beg, Md Zahidul Islam, Vladimir Estivill-Castro

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

Interpretable Clustering via Optimal Trees

State-of-the-art clustering algorithms use heuristics to partition the feature space and provide little insight into the rationale for cluster membership, limiting their interpretability. In healthcare applications, the latter poses a…

Machine Learning · Statistics 2018-12-04 Dimitris Bertsimas, Agni Orfanoudaki, Holly Wiberg

Random Forest for Dissimilarity-based Multi-view Learning

Many classification problems are naturally multi-view in the sense that their data are described through multiple heterogeneous descriptions. For such tasks, dissimilarity strategies are effective ways to make the different descriptions…

Machine Learning · Computer Science 2020-07-17 Simon Bernard, Hongliu Cao, Robert Sabourin, Laurent Heutte

Learning Order Forest for Qualitative-Attribute Data Clustering

Clustering is a fundamental approach to understanding data patterns, wherein the intuitive Euclidean distance space is commonly adopted. However, this is not the case for implicit cluster distributions reflected by qualitative attribute…

Machine Learning · Statistics 2026-03-05 Mingjie Zhao, Sen Feng, Yiqun Zhang, Mengke Li, Yang Lu, Yiu-ming Cheung

Combining clustering of variables and feature selection using random forests

Standard approaches to tackle high-dimensional supervised classification problems often include variable selection and dimension reduction procedures. The novel methodology proposed in this paper combines clustering of variables and feature…

Statistics Theory · Mathematics 2018-11-07 Marie Chavent, Robin Genuer, Jerome Saracco

Using Decision Trees for Interpretable Supervised Clustering

In this paper, we address an issue of finding explainable clusters of class-uniform data in labelled datasets. The issue falls into the domain of interpretable supervised clustering. Unlike traditional clustering, supervised clustering aims…

Machine Learning · Computer Science 2023-07-18 Natallia Kokash, Leonid Makhnist

Random Similarity Forests

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science 2022-04-13 Maciej Piernik, Dariusz Brzezinski, Pawel Zawadzki

Forest Floor Visualizations of Random Forests

We propose a novel methodology, forest floor, to visualize and interpret random forest (RF) models. RF is a popular and useful tool for non-linear multi-variate classification and regression, which yields a good trade-off between robustness…

Machine Learning · Statistics 2016-07-05 Soeren H. Welling, Hanne H. F. Refsgaard, Per B. Brockhoff, Line H. Clemmensen