Related papers: Random Planted Forest: a directly interpretable tr…

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data,…

Machine Learning · Statistics 2022-10-13 Domagoj Ćevid, Loris Michel, Jeffrey Näf, Nicolai Meinshausen, Peter Bühlmann

Distributional Adaptive Soft Regression Trees

Random forests are an ensemble method relevant for many problems, such as regression or classification. They are popular due to their good predictive performance (compared to, e.g., decision trees) requiring only minimal tuning of…

Methodology · Statistics 2022-10-20 Nikolaus Umlauf, Nadja Klein

A Random Forest Approach for Modeling Bounded Outcomes

Random forests have become an established tool for classification and regression, in particular in high-dimensional settings and in the presence of complex predictor-response relationships. For bounded outcome variables restricted to the…

Methodology · Statistics 2019-01-21 Leonie Weinhold, Matthias Schmid, Marvin N. Wright, Moritz Berger

Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift

We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests…

Methodology · Statistics 2026-01-26 Elliot H. Young, Peter Bühlmann

Uncharted Forest a Technique for Exploratory Data Analysis

Exploratory data analysis is crucial for developing and understanding classification models from high-dimensional datasets. We explore the utility of a new unsupervised tree ensemble called uncharted forest for visualizing class…

Machine Learning · Statistics 2018-07-03 Casey Kneale, Steven D. Brown

Linear Aggregation in Tree-based Estimators

Regression trees and their ensemble methods are popular methods for nonparametric regression: they combine strong predictive performance with interpretable estimators. To improve their utility for locally smooth response surfaces, we study…

Methodology · Statistics 2021-09-13 Sören R. Künzel, Theo F. Saarinen, Edward W. Liu, Jasjeet S. Sekhon

Measure Inducing Classification and Regression Trees for Functional Data

We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which makes it possible to leverage representation learning and multiple splitting rules at the node level, reducing…

Machine Learning · Statistics 2020-11-03 Edoardo Belli, Simone Vantini

Inference with Randomized Regression Trees

Regression trees are a popular machine learning algorithm that fit piecewise constant models by recursively partitioning the predictor space. This paper focuses on statistical inference for a data-dependent model obtained from a fitted…

Methodology · Statistics 2025-12-17 Soham Bakshi, Yiling Huang, Snigdha Panigrahi, Walter Dempsey

Analyze Additive and Interaction Effects via Collaborative Trees

We present Collaborative Trees, a novel tree model designed for regression prediction, along with its bagging version, which aims to analyze complex statistical associations between features and uncover potential patterns inherent in the…

Methodology · Statistics 2024-05-21 Chien-Ming Chi

Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem

Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest…

Econometrics · Economics 2020-12-22 Mochen Yang, Edward McFowland, Gordon Burtch, Gediminas Adomavicius

Spectrally Deconfounded Random Forests

We introduce a modification of Random Forests to estimate functions when unobserved confounding variables are present. The technique is tailored for high-dimensional settings with many observed covariates. We use spectral deconfounding…

Computation · Statistics 2025-09-25 Markus Ulmer, Cyrill Scheidegger, Peter Bühlmann

Data Selection: A General Principle for Building Small Interpretable Models

We present convincing empirical evidence for an effective and general strategy for building accurate small models. Such models are attractive for interpretability and also find use in resource-constrained environments. The strategy is to…

Machine Learning · Computer Science 2024-04-30 Abhishek Ghose

Understanding Random Forests: From Theory to Practice

Data analysis and machine learning have become an integrative part of the modern scientific methodology, offering automated procedures for the prediction of a phenomenon based on past observations, unraveling underlying patterns in data and…

Machine Learning · Statistics 2015-06-04 Gilles Louppe

Interpretable Network-assisted Random Forest+

Machine learning algorithms often assume that training samples are independent. When data points are connected by a network, the induced dependency between samples is both a challenge, reducing effective sample size, and an opportunity to…

Machine Learning · Statistics 2025-09-22 Tiffany M. Tang, Elizaveta Levina, Ji Zhu

Born-Again Tree Ensembles

The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix…

Machine Learning · Computer Science 2021-02-02 Thibaut Vidal, Toni Pacheco, Maximilian Schiffer

Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty…

Machine Learning · Computer Science 2018-11-20 Myriam Tami, Marianne Clausel, Emilie Devijver, Adrien Dulac, Eric Gaussier, Stefan Janaqi, Meriam Chebre

Regression tree models for designed experiments

Although regression trees were originally designed for large datasets, they can profitably be used on small datasets as well, including those from replicated or unreplicated complete factorial experiments. We show that in the latter…

Statistics Theory · Mathematics 2007-06-13 Wei-Yin Loh

Transformation Forests

Regression models for supervised learning problems with a continuous target are commonly understood as models for the conditional mean of the target given predictors. This notion is simple and therefore appealing for interpretation and…

Methodology · Statistics 2018-01-09 Torsten Hothorn, Achim Zeileis

Local Interpretability of Random Forests for Multi-Target Regression

Multi-target regression is useful in a plethora of applications. Although random forest models perform well in these tasks, they are often difficult to interpret. Interpretability is crucial in machine learning, especially when it can…

Machine Learning · Computer Science 2023-03-30 Avraam Bardos, Nikolaos Mylonas, Ioannis Mollas, Grigorios Tsoumakas

Ranking Perspective for Tree-based Methods with Applications to Symbolic Feature Selection

Tree-based methods are powerful nonparametric techniques in statistics and machine learning. However, their effectiveness, particularly in finite-sample settings, is not fully understood. Recent applications have revealed their surprising…

Statistics Theory · Mathematics 2024-10-04 Hengrui Luo, Meng Li