Related papers: Consistency of Random Forest Type Algorithms under…

Consistency of random forests

Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5--32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical…

Statistics Theory · Mathematics 2015-08-11 Erwan Scornet, Gérard Biau, Jean-Philippe Vert

Consistency of Honest Decision Trees and Random Forests

We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under…

Methodology · Statistics 2026-05-21 Martin Bladt, Rasmus Frigaard Lemvig

Consistency of Online Random Forests

As a testament to their success, the theory of random forests has long been outpaced by their application in practice. In this paper, we take a step towards narrowing this gap by providing a consistency result for online random forests.

Machine Learning · Statistics 2013-05-09 Misha Denil, David Matheson, Nando de Freitas

Regularized impurity reduction: Accurate decision trees with complexity guarantees

Decision trees are popular classification models, providing high accuracy and intuitive explanations. However, as the tree size grows the model interpretability deteriorates. Traditional tree-induction algorithms, such as C4.5 and CART,…

Machine Learning · Computer Science 2022-11-29 Guangyi Zhang, Aristides Gionis

On the asymptotics of random forests

The last decade has witnessed a growing interest in random forest models which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains…

Statistics Theory · Mathematics 2014-09-09 Erwan Scornet

Asymptotic Properties of High-Dimensional Random Forests

As a flexible nonparametric learning tool, the random forests algorithm has been widely applied to various real applications with appealing empirical performance, even in the presence of high-dimensional feature space. Unveiling the…

Statistics Theory · Mathematics 2022-09-27 Chien-Ming Chi, Patrick Vossler, Yingying Fan, Jinchi Lv

Random Forests Can Hash

Hash codes are a very efficient data representation needed to be able to cope with the ever growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first…

Computer Vision and Pattern Recognition · Computer Science 2015-04-20 Qiang Qiu, Guillermo Sapiro, Alex Bronstein

Comparison-Based Random Forests

Assume we are given a set of items from a general metric space, but we neither have access to the representation of the data nor to the distances between data points. Instead, suppose that we can actively choose a triplet of items (A,B,C)…

Machine Learning · Statistics 2018-06-19 Siavash Haghiri, Damien Garreau, Ulrike von Luxburg

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary classification problems. The terminology "best-scored" means to select the one with the best empirical performance out of a certain number of purely random tree candidates…

Machine Learning · Statistics 2019-05-28 Hanyuan Hang, Xiaoyu Liu, Ingo Steinwart

Universal consistency and minimax rates for online Mondrian Forests

We establish the consistency of an algorithm of Mondrian Forests, a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm, that considers a fixed lifetime parameter.…

Machine Learning · Statistics 2017-11-09 Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

Consistency of Random Survival Forests

We prove uniform consistency of Random Survival Forests (RSF), a newly introduced forest ensemble learner for analysis of right-censored survival data. Consistency is proven under general splitting rules, bootstrapping, and random selection…

Statistics Theory · Mathematics 2008-11-19 Hemant Ishwaran, Udaya B. Kogalur

Narrowing the Gap: Random Forests In Theory and In Practice

Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of…

Machine Learning · Statistics 2013-10-08 Misha Denil, David Matheson, Nando de Freitas

Even naive trees are consistent

The last decade has shed some light on theoretical properties such as their consistency for regression tasks. In the current paper, we propose a new class of very simple learners based on so-called naive trees. These naive trees partition…

Statistics Theory · Mathematics 2024-12-18 Nico Föge, Markus Pauly, Lena Schmid, Marc Ditzhaus

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty…

Machine Learning · Computer Science 2018-11-20 Myriam Tami, Marianne Clausel, Emilie Devijver, Adrien Dulac, Eric Gaussier, Stefan Janaqi, Meriam Chebre

Random Similarity Forests

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science 2022-04-13 Maciej Piernik, Dariusz Brzezinski, Pawel Zawadzki

Analysis of purely random forests bias

Random forests are a very effective and commonly used statistical method, but their full theoretical analysis is still an open problem. As a first step, simplified models such as purely random forests have been introduced, in order to shed…

Statistics Theory · Mathematics 2014-07-16 Sylvain Arlot, Robin Genuer

On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation

Decision tree learning is increasingly being used for pointwise inference. Important applications include causal heterogeneous treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of…

Machine Learning · Statistics 2024-02-08 Matias D. Cattaneo, Jason M. Klusowski, Peter M. Tian

Simplifying Random Forests: On the Trade-off between Interpretability and Accuracy

We analyze the trade-off between model complexity and accuracy for random forests by breaking the trees up into individual classification rules and selecting a subset of them. We show experimentally that already a few rules are sufficient…

Machine Learning · Computer Science 2020-12-09 Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz

Can a Single Tree Outperform an Entire Forest?

The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy, despite its advantages in interpretability and lightweight structure. This study challenges such a mindset by significantly…

Machine Learning · Computer Science 2024-11-27 Qiangqiang Mao, Yankai Cao