Related papers: S-SIRUS: an explainability algorithm for spatial r…

Interpretable Random Forests via Rule Extraction

We introduce SIRUS (Stable and Interpretable RUle Set) for regression, a stable rule learning algorithm which takes the form of a short and simple list of rules. State-of-the-art learning algorithms are often referred to as "black boxes"…

Machine Learning · Statistics 2021-02-11 Clément Bénard, Gérard Biau, Sébastien da Veiga, Erwan Scornet

SIRUS: Stable and Interpretable RUle Set for Classification

State-of-the-art learning algorithms, such as random forests or neural networks, are often qualified as "black-boxes" because of the high number and complexity of operations involved in their prediction mechanism. This lack of…

Machine Learning · Statistics 2020-12-17 Clément Bénard, Gérard Biau, Sébastien da Veiga, Erwan Scornet

A path in regression Random Forest looking for spatial dependence: a taxonomy and a systematic review

Random Forest (RF) is a well-known data-driven algorithm applied in several fields thanks to its flexibility in modeling the relationship between the response variable and the predictors, also in case of strong non-linearities. In…

Machine Learning · Statistics 2023-10-18 Luca Patelli, Michela Cameletti, Natalia Golini, Rosaria Ignaccolo

Random Forests for dependent data

Random forest (RF) is one of the most popular methods for estimating regression functions. The local nature of the RF algorithm, based on intra-node means and variances, is ideal when errors are i.i.d. For dependent error processes like…

Machine Learning · Statistics 2021-06-29 Arkajyoti Saha, Sumanta Basu, Abhirup Datta

Boosting SISSO Performance on Small Sample Datasets by Using Random Forests Prescreening for Complex Feature Selection

In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression is a key to extracting material descriptors from large datasets, in particular…

Machine Learning · Computer Science 2024-10-01 Xiaolin Jiang, Guanqi Liu, Jiaying Xie, Zhenpeng Hu

Regression-Enhanced Random Forests

Random forest (RF) methodology is one of the most popular machine learning techniques for prediction problems. In this article, we discuss some cases where random forests may suffer and propose a novel generalized RF method, namely…

Machine Learning · Statistics 2019-04-24 Haozhe Zhang, Dan Nettleton, Zhengyuan Zhu

An Approximation Method for Fitted Random Forests

Random Forests (RF) is a popular machine learning method for classification and regression problems. It involves a bagging application to decision tree models. One of the primary advantages of the Random Forests model is the reduction in…

Machine Learning · Statistics 2022-07-06 Sai K Popuri

Visualizing Random Forest with Self-Organising Map

Random Forest (RF) is a powerful ensemble method for classification and regression tasks. It consists of a set of decision trees. Although a single tree is easily interpretable by humans, the ensemble of trees is a black-box model. The popular…

Machine Learning · Computer Science 2014-07-17 Piotr Płoński, Krzysztof Zaremba

Enhanced Local Explainability and Trust Scores with Random Forest Proximities

We initiate a novel approach to explain the predictions and out of sample performance of random forest (RF) regression and classification models by exploiting the fact that any RF can be mathematically formulated as an adaptive weighted K…

Machine Learning · Statistics 2024-08-07 Joshua Rosaler, Dhruv Desai, Bhaskarjit Sarmah, Dimitrios Vamvourellis, Deran Onay, Dhagash Mehta, Stefano Pasquali

Random Forest Missing Data Algorithms

Random forest (RF) missing data algorithms are an attractive approach for dealing with missing data. They have the desirable properties of being able to handle mixed types of missing data, they are adaptive to interactions and nonlinearity,…

Machine Learning · Statistics 2017-01-23 Fei Tang, Hemant Ishwaran

ggRandomForests: Visually Exploring a Random Forest for Regression

Random Forests [Breiman:2001] (RF) are a fully non-parametric statistical method requiring no distributional assumptions on covariate relation to the response. RF are a robust, nonlinear technique that optimizes predictive accuracy by…

Computation · Statistics 2016-12-30 John Ehrlinger

Sparse residual tree and forest

Sparse residual tree (SRT) is an adaptive exploration method for multivariate scattered data approximation. It leads to sparse and stable approximations in areas where the data is sufficient or redundant, and points out the possible local…

Numerical Analysis · Mathematics 2019-05-15 Xin Xu, Xiaopeng Luo

Random Forest regression for manifold-valued responses

An increasing array of biomedical and computer vision applications requires the predictive modeling of complex data, for example images and shapes. The main challenge when predicting such objects lies in the fact that they do not comply to…

Machine Learning · Statistics 2017-02-17 Dimosthenis Tsagkrasoulis, Giovanni Montana

Improving Random Forests by Smoothing

Random forest regression is a powerful non-parametric method that adapts to local data characteristics through data-driven partitioning, making it effective across diverse application domains. However, the piecewise constant nature of…

Machine Learning · Computer Science 2026-05-19 Ziyi Liu, Phuc Luong, Mario Boley, Daniel F. Schmidt

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

ROSE Random Forests for Robust Semiparametric Efficient Estimation

It is widely recognised that semiparametric efficient estimation can be hard to achieve in practice: estimators that are in theory efficient may require unattainable levels of accuracy for the estimation of complex nuisance functions. As a…

Statistics Theory · Mathematics 2024-12-18 Elliot H. Young, Rajen D. Shah

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data,…

Machine Learning · Statistics 2022-10-13 Domagoj Ćevid, Loris Michel, Jeffrey Näf, Nicolai Meinshausen, Peter Bühlmann

Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success

Predicting rare outcomes such as startup success is central to venture capital, demanding models that are both accurate and interpretable. We introduce Random Rule Forest (RRF), a lightweight ensemble method that uses a large language model…

Artificial Intelligence · Computer Science 2025-09-17 Ben Griffin, Diego Vidaurre, Ugur Koyluoglu, Joseph Ternasky, Fuat Alican, Yigit Ihlamur

Probabilistic Random Forest: A machine learning algorithm for noisy datasets

Machine learning (ML) algorithms become increasingly important in the analysis of astronomical data. However, since most ML algorithms are not designed to take data uncertainties into account, ML based studies are mostly restricted to data…

Instrumentation and Methods for Astrophysics · Physics 2018-12-26 Itamar Reis, Dalya Baron, Sahar Shahaf

Spatially Coherent Random Forests

Spatially Coherent Random Forest (SCRF) extends Random Forest to create spatially coherent labeling. Each split function in SCRF is evaluated based on a traditional information gain measure that is regularized by a spatial coherency term.…

Computer Vision and Pattern Recognition · Computer Science 2015-12-08 Tal Remez, Shai Avidan