Related papers: Grafting: Making Random Forests Consistent

Narrowing the Gap: Random Forests In Theory and In Practice

Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of…

Machine Learning · Statistics 2013-10-08 Misha Denil, David Matheson, Nando de Freitas

Asymptotic Properties of High-Dimensional Random Forests

As a flexible nonparametric learning tool, the random forests algorithm has been widely applied to various real applications with appealing empirical performance, even in the presence of high-dimensional feature space. Unveiling the…

Statistics Theory · Mathematics 2022-09-27 Chien-Ming Chi, Patrick Vossler, Yingying Fan, Jinchi Lv

On the Consistency of a Random Forest Algorithm in the Presence of Missing Entries

This paper tackles the problem of constructing a non-parametric predictor when the latent variables are given with incomplete information. The convenient predictor for this task is the random forest algorithm in conjunction to the so-called…

Statistics Theory · Mathematics 2023-09-01 Irving Gómez-Méndez, Emilien Joly

Pure interaction effects unseen by Random Forests

Random Forests are widely claimed to capture interactions well. However, some simple examples suggest that they perform poorly in the presence of certain pure interactions that the conventional CART criterion struggles to capture during…

Machine Learning · Statistics 2025-08-04 Ricardo Blum, Munir Hiabu, Enno Mammen, Joseph Theo Meyer

Consistency of Online Random Forests

As a testament to their success, the theory of random forests has long been outpaced by their application in practice. In this paper, we take a step towards narrowing this gap by providing a consistency result for online random forests.

Machine Learning · Statistics 2013-05-09 Misha Denil, David Matheson, Nando de Freitas

Consistency of random forests

Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5--32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical…

Statistics Theory · Mathematics 2015-08-11 Erwan Scornet, Gérard Biau, Jean-Philippe Vert

Consistency of Oblique Decision Tree and its Boosting and Random Forest

Classification and Regression Tree (CART), Random Forest (RF) and Gradient Boosting Tree (GBT) are probably the most popular set of statistical learning methods. However, their statistical consistency can only be proved under very…

Statistics Theory · Mathematics 2025-02-17 Haoran Zhan, Yu Liu, Yingcun Xia

Consistency of Honest Decision Trees and Random Forests

We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under…

Methodology · Statistics 2026-05-21 Martin Bladt, Rasmus Frigaard Lemvig

Consistency of Random Forest Type Algorithms under a Probabilistic Impurity Decrease Condition

This paper derives a unifying theorem establishing consistency results for a broad class of tree-based algorithms. It improves current results in two aspects. First of all, it can be applied to algorithms that vary from traditional Random…

Statistics Theory · Mathematics 2024-02-22 Ricardo Blum, Munir Hiabu, Enno Mammen, Joseph T. Meyer

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Even naive trees are consistent

The last decade has shed some light on theoretical properties such as their consistency for regression tasks. In the current paper, we propose a new class of very simple learners based on so-called naive trees. These naive trees partition…

Statistics Theory · Mathematics 2024-12-18 Nico Föge, Markus Pauly, Lena Schmid, Marc Ditzhaus

Confidence and Uncertainty Assessment for Distributional Random Forests

The Distributional Random Forest (DRF) is a recently introduced Random Forest algorithm to estimate multivariate conditional distributions. Due to its general estimation procedure, it can be employed to estimate a wide range of targets such…

Statistics Theory · Mathematics 2023-12-20 Jeffrey Näf, Corinne Emmenegger, Peter Bühlmann, Nicolai Meinshausen

Spanning Trees in Random Satisfiability Problems

Working with tree graphs is always easier than with loopy ones and spanning trees are the closest tree-like structures to a given graph. We find a correspondence between the solutions of random K-satisfiability problem and those of spanning…

Disordered Systems and Neural Networks · Physics 2009-11-11 A. Ramezanpour, S. Moghimi-Araghi

Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

Tree-based ensemble methods, as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty…

Machine Learning · Computer Science 2018-11-20 Myriam Tami, Marianne Clausel, Emilie Devijver, Adrien Dulac, Eric Gaussier, Stefan Janaqi, Meriam Chebre

Random Forest Calibration

The Random Forest (RF) classifier is often claimed to be relatively well calibrated when compared with other machine learning methods. Moreover, the existing literature suggests that traditional calibration methods, such as isotonic…

Machine Learning · Computer Science 2025-01-29 Mohammad Hossein Shaker, Eyke Hüllermeier

Slow-Growing Trees

Random Forest's performance can be matched by a single slow-growing tree (SGT), which uses a learning rate to tame CART's greedy algorithm. SGT exploits the view that CART is an extreme case of an iterative weighted least square procedure.…

Machine Learning · Statistics 2021-07-15 Philippe Goulet Coulombe

Random Forests and Networks Analysis

D. Wilson [Wi] in the 1990s described a simple and efficient algorithm based on loop-erased random walks to sample uniform spanning trees and more generally weighted trees or forests spanning a given graph. This algorithm provides a…

Probability · Mathematics 2018-08-29 L. Avena, F. Castell, A. Gaudilliere, C. Melot

Consistency of Random Survival Forests

We prove uniform consistency of Random Survival Forests (RSF), a newly introduced forest ensemble learner for analysis of right-censored survival data. Consistency is proven under general splitting rules, bootstrapping, and random selection…

Statistics Theory · Mathematics 2008-11-19 Hemant Ishwaran, Udaya B. Kogalur

Random Similarity Forests

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science 2022-04-13 Maciej Piernik, Dariusz Brzezinski, Pawel Zawadzki

Simplifying Random Forests: On the Trade-off between Interpretability and Accuracy

We analyze the trade-off between model complexity and accuracy for random forests by breaking the trees up into individual classification rules and selecting a subset of them. We show experimentally that already a few rules are sufficient…

Machine Learning · Computer Science 2020-12-09 Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz