Related papers: Improving Random Forests by Smoothing

Local Linear Forests

Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals, and can show poor predictive performance in the presence of strong, smooth effects. Taking the perspective of random…

Machine Learning · Statistics 2020-09-08 Rina Friedberg, Julie Tibshirani, Susan Athey, Stefan Wager

Making Sense of Random Forest Probabilities: a Kernel Perspective

A random forest is a popular tool for estimating probabilities in machine learning classification tasks. However, the means by which this is accomplished is unprincipled: one simply counts the fraction of trees in a forest that vote for a…

Machine Learning · Statistics 2018-12-17 Matthew A. Olson, Abraham J. Wyner

Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers

Despite their remarkable effectiveness and broad application, the drivers of success underlying ensembles of trees are still not fully understood. In this paper, we highlight how interpreting tree ensembles as adaptive and self-regularizing…

Machine Learning · Statistics 2024-02-05 Alicia Curth, Alan Jeffares, Mihaela van der Schaar

Consistency of random forests

Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5--32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical…

Statistics Theory · Mathematics 2015-08-11 Erwan Scornet, Gérard Biau, Jean-Philippe Vert

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data,…

Machine Learning · Statistics 2022-10-13 Domagoj Ćevid, Loris Michel, Jeffrey Näf, Nicolai Meinshausen, Peter Bühlmann

Distributional Adaptive Soft Regression Trees

Random forests are an ensemble method relevant for many problems, such as regression or classification. They are popular due to their good predictive performance (compared to, e.g., decision trees) requiring only minimal tuning of…

Methodology · Statistics 2022-10-20 Nikolaus Umlauf, Nadja Klein

Censored Quantile Regression Forest

Random forests are a powerful non-parametric regression method but are severely limited in their usage in the presence of randomly censored observations, and naively applied can exhibit poor predictive performance due to the incurred biases.…

Machine Learning · Statistics 2020-01-13 Alexander Hanbo Li, Jelena Bradic

Censored Quantile Regression Forests

Random forests are a powerful non-parametric regression method but are severely limited in their usage in the presence of randomly censored observations, and naively applied can exhibit poor predictive performance due to the incurred biases.…

Machine Learning · Statistics 2019-02-12 Alexander Hanbo Li, Jelena Bradic

Lassoed Forests: Random Forests with Adaptive Lasso Post-selection

Random forests are a statistical learning technique that uses bootstrap aggregation to average high-variance and low-bias trees. Improvements to random forests, such as applying Lasso regression to the tree predictions, have been proposed in…

Machine Learning · Statistics 2025-11-13 Jing Shang, James Bannon, Benjamin Haibe-Kains, Robert Tibshirani

Random Spatial Forests

We introduce random spatial forests, a method of bagging regression trees allowing for spatial correlation. Our main contribution is the development of a computationally efficient tree building algorithm which selects each split of the tree…

Methodology · Statistics 2020-07-24 Travis Hee Wai, Michael T. Young, Adam A. Szpiro

Random forests and kernel methods

Random forests are ensemble methods which grow trees as base learners and combine their predictions by averaging. Random forests are known for their good practical performance, particularly in high dimensional settings. On the theoretical…

Statistics Theory · Mathematics 2015-09-18 Erwan Scornet

Consistency of Honest Decision Trees and Random Forests

We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under…

Methodology · Statistics 2026-05-21 Martin Bladt, Rasmus Frigaard Lemvig

Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem

Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest…

Econometrics · Economics 2020-12-22 Mochen Yang, Edward McFowland, Gordon Burtch, Gediminas Adomavicius

(f)RFCDE: Random Forests for Conditional Density Estimation and Functional Data

Random forests is a common non-parametric regression technique which performs well for mixed-type unordered data and irrelevant features, while being robust to monotonic variable transformations. Standard random forests, however, do not…

Computation · Statistics 2019-06-19 Taylor Pospisil, Ann B. Lee

Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success

Random forests remain among the most popular off-the-shelf supervised machine learning tools with a well-established track record of predictive accuracy in both regression and classification settings. Despite their empirical success as well…

Machine Learning · Statistics 2020-09-15 Lucas Mentch, Siyu Zhou

Fréchet random forests for metric space valued regression with non euclidean predictors

Random forests are a statistical learning method widely used in many areas of scientific research because of its ability to learn complex relationships between input and output variables and also its capacity to handle high-dimensional…

Machine Learning · Statistics 2024-02-19 Louis Capitaine, Jérémie Bigot, Rodolphe Thiébaut, Robin Genuer

An RKHS Perspective on Tree Ensembles

Random Forests and Gradient Boosting are among the most effective algorithms for supervised learning on tabular data. Both belong to the class of tree-based ensemble methods, where predictions are obtained by aggregating many randomized…

Machine Learning · Statistics 2025-12-02 Mehdi Dagdoug, Clement Dombry, Jean-Jil Duchamps

Generalized Random Forests

We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment…

Methodology · Statistics 2018-04-06 Susan Athey, Julie Tibshirani, Stefan Wager

Random Forest Weighted Local Fréchet Regression with Random Objects

Statistical analysis is increasingly confronted with complex data from metric spaces. Petersen and Müller (2019) established a general paradigm of Fréchet regression with complex metric space valued responses and Euclidean predictors.…

Machine Learning · Statistics 2025-02-10 Rui Qiu, Zhou Yu, Ruoqing Zhu

Neural Random Forest Imitation

We present Neural Random Forest Imitation - a novel approach for transforming random forests into neural networks. Existing methods propose a direct mapping and produce very inefficient architectures. In this work, we introduce an imitation…

Machine Learning · Computer Science 2024-04-05 Christoph Reinders, Bodo Rosenhahn