Related papers: Node harvest

Experiments with Optimal Model Trees

Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to ``classic'' decision trees with constant values in their leaves, model trees can use linear…

Machine Learning · Computer Science 2026-03-11 Sabino Francesco Roselli , Eibe Frank

Exploiting random projections and sparsity with random forests and gradient boosting methods -- Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system, from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested…

Machine Learning · Statistics 2019-05-20 Arnaud Joly

Random Planted Forest: a directly interpretable tree ensemble

We introduce a novel interpretable tree based algorithm for prediction in a regression setting. Our motivation is to estimate the unknown regression function from a functional decomposition perspective in which the functional components…

Machine Learning · Statistics 2023-08-04 Munir Hiabu , Enno Mammen , Joseph T. Meyer

Data Selection: A General Principle for Building Small Interpretable Models

We present convincing empirical evidence for an effective and general strategy for building accurate small models. Such models are attractive for interpretability and also find use in resource-constrained environments. The strategy is to…

Machine Learning · Computer Science 2024-04-30 Abhishek Ghose

Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design

Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the…

Machine Learning · Computer Science 2022-10-27 Ali Behrouz , Mathias Lecuyer , Cynthia Rudin , Margo Seltzer

Consensus Tree Estimation with False Discovery Rate Control via Partially Ordered Sets

Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and…

Methodology · Statistics 2025-12-01 Maria Alejandra Valdez Cabrera , Amy D Willis , Armeen Taeb

TRUST: Transparent, Robust and Ultra-Sparse Trees

Piecewise-constant regression trees remain popular for their interpretability, yet often lag behind black-box models like Random Forest in predictive accuracy. In this work, we introduce TRUST (Transparent, Robust, and Ultra-Sparse Trees),…

Methodology · Statistics 2025-06-23 Albert Dorador

Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem

Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest…

Econometrics · Economics 2020-12-22 Mochen Yang , Edward McFowland , Gordon Burtch , Gediminas Adomavicius

Example-Based Explanations of Random Forest Predictions

A random forest prediction can be computed by the scalar product of the labels of the training examples and a set of weights that are determined by the leafs of the forest into which the test object falls; each prediction can hence be…

Machine Learning · Computer Science 2023-11-27 Henrik Boström

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity…

Optimization and Control · Mathematics 2020-02-24 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

Multi-Model Subset Selection

The two primary approaches for high-dimensional regression problems are sparse methods (e.g., best subset selection, which uses the L0-norm in the penalty) and ensemble methods (e.g., random forests). Although sparse methods typically yield…

Methodology · Statistics 2024-10-31 Anthony-Alexander Christidis , Stefan Van Aelst , Ruben Zamar

Decision Stream: Cultivating Deep Decision Trees

Various modifications of decision trees have been extensively used during the past years due to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning, at…

Machine Learning · Computer Science 2017-09-05 Dmitry Ignatov , Andrey Ignatov

Soft regression trees: a model variant and a decomposition training algorithm

Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized…

Machine Learning · Computer Science 2025-01-28 Antonio Consolo , Edoardo Amaldi , Andrea Manno

Inherently Interpretable Tree Ensemble Learning

Tree ensemble models like random forests and gradient boosting machines are widely used in machine learning due to their excellent predictive performance. However, a high-performance ensemble consisting of a large number of decision trees…

Machine Learning · Statistics 2024-10-28 Zebin Yang , Agus Sudjianto , Xiaoming Li , Aijun Zhang

Distributional Adaptive Soft Regression Trees

Random forests are an ensemble method relevant for many problems, such as regression or classification. They are popular due to their good predictive performance (compared to, e.g., decision trees) requiring only minimal tuning of…

Methodology · Statistics 2022-10-20 Nikolaus Umlauf , Nadja Klein

A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds

Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well…

Machine Learning · Statistics 2021-10-20 Yan Shuo Tan , Abhineet Agarwal , Bin Yu

Lassoed Forests: Random Forests with Adaptive Lasso Post-selection

Random forests are a statistical learning technique that use bootstrap aggregation to average high-variance and low-bias trees. Improvements to random forests, such as applying Lasso regression to the tree predictions, have been proposed in…

Machine Learning · Statistics 2025-11-13 Jing Shang , James Bannon , Benjamin Haibe-Kains , Robert Tibshirani

Generating Compact Tree Ensembles via Annealing

Tree ensembles are flexible predictive models that can capture relevant variables and to some extent their interactions in a compact and interpretable manner. Most algorithms for obtaining tree ensembles are based on versions of boosting or…

Machine Learning · Statistics 2020-02-21 Gitesh Dawer , Yangzi Guo , Adrian Barbu

Born-Again Tree Ensembles

The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix…

Machine Learning · Computer Science 2021-02-02 Thibaut Vidal , Toni Pacheco , Maximilian Schiffer

Theoretical and Empirical Advances in Forest Pruning

Regression forests have long delivered state-of-the-art accuracy, often outperforming regression trees and even neural networks, but they suffer from limited interpretability as ensemble methods. In this work, we revisit forest pruning, an…

Machine Learning · Statistics 2025-03-10 Albert Dorador