Related papers: Optimal randomized classification trees

Sparse learning with CART

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogenous daughter nodes according to…

Machine Learning · Statistics 2020-11-20 Jason M. Klusowski

MurTree: Optimal Classification Trees via Dynamic Programming and Search

Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy.…

Machine Learning · Computer Science 2022-06-30 Emir Demirović , Anna Lukina , Emmanuel Hebrard , Jeffrey Chan , James Bailey , Christopher Leckie , Kotagiri Ramamohanarao , Peter J. Stuckey

Dive into Decision Trees and Forests: A Theoretical Demonstration

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

Covariance-Driven Regression Trees: Reducing Overfitting in CART

Decision trees are powerful machine learning algorithms, widely used in fields such as economics and medicine for their simplicity and interpretability. However, decision trees such as CART are prone to overfitting, especially when grown…

Machine Learning · Statistics 2026-01-13 Likun Zhang , Wei Ma

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity…

Optimization and Control · Mathematics 2020-02-24 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

Experiments with Optimal Model Trees

Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to ``classic'' decision trees with constant values in their leaves, model trees can use linear…

Machine Learning · Computer Science 2026-03-11 Sabino Francesco Roselli , Eibe Frank

Inference with Randomized Regression Trees

Regression trees are a popular machine learning algorithm that fit piecewise constant models by recursively partitioning the predictor space. This paper focuses on statistical inference for a data-dependent model obtained from a fitted…

Methodology · Statistics 2025-12-17 Soham Bakshi , Yiling Huang , Snigdha Panigrahi , Walter Dempsey

Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound

Computing an optimal classification tree that provably maximizes training performance within a given size limit, is NP-hard, and in practice, most state-of-the-art methods do not scale beyond computing optimal trees of depth three.…

Machine Learning · Computer Science 2025-01-15 Catalin E. Brita , Jacobus G. M. van der Linden , Emir Demirović

Analyzing CART

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For binary classification and regression models, this approach recursively divides the data into two near-homogenous…

Machine Learning · Statistics 2020-08-17 Jason M. Klusowski

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary classification problems. The terminology "best-scored" means to select the one with the best empirical performance out of a certain number of purely random tree candidates…

Machine Learning · Statistics 2019-05-28 Hanyuan Hang , Xiaoyu Liu , Ingo Steinwart

TREE: Tree Regularization for Efficient Execution

The rise of machine learning methods on heavily resource constrained devices requires not only the choice of a suitable model architecture for the target platform, but also the optimization of the chosen model with regard to execution time…

Machine Learning · Computer Science 2024-06-19 Lena Schmid , Daniel Biebert , Christian Hakert , Kuan-Hsun Chen , Michel Lang , Markus Pauly , Jian-Jia Chen

HHCART: An Oblique Decision Tree

Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic…

Machine Learning · Statistics 2015-04-15 D. C. Wickramarachchi , B. L. Robertson , M. Reale , C. J. Price , J. Brown

Generalized and Scalable Optimal Sparse Decision Trees

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been…

Machine Learning · Computer Science 2022-11-24 Jimmy Lin , Chudi Zhong , Diane Hu , Cynthia Rudin , Margo Seltzer

Dynamic Trees for Learning and Design

Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the…

Methodology · Statistics 2010-11-23 Matthew A. Taddy , Robert B. Gramacy , Nicholas G. Polson

Near Optimal Decision Trees in a SPLIT Second

Decision tree optimization is fundamental to interpretable machine learning. The most popular approach is to greedily search for the best feature at every decision point, which is fast but provably suboptimal. Recent approaches find the…

Machine Learning · Computer Science 2025-11-19 Varun Babbar , Hayden McTavish , Cynthia Rudin , Margo Seltzer

Big Data Regression Using Tree Based Segmentation

Scaling regression to large datasets is a common problem in many application areas. We propose a two step approach to scaling regression to large datasets. Using a regression tree (CART) to segment the large dataset constitutes the first…

Machine Learning · Statistics 2017-07-26 Rajiv Sambasivan , Sourish Das

A novel gradient-based method for decision trees optimizing arbitrary differential loss functions

There are many approaches for training decision trees. This work introduces a novel gradient-based method for constructing decision trees that optimize arbitrary differentiable loss functions, overcoming the limitations of heuristic…

Machine Learning · Computer Science 2025-03-25 Andrei V. Konstantinov , Lev V. Utkin

Classification Trees with Valid Inference via the Exponential Mechanism

Decision trees are widely used for non-linear modeling, as they capture interactions between predictors while producing inherently interpretable models. Despite their popularity, performing inference on the non-linear fit remains largely…

Methodology · Statistics 2026-04-14 Soham Bakshi , Snigdha Panigrahi

Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees

Additive models, such as produced by gradient boosting, and full interaction models, such as classification and regression trees (CART), are widely used algorithms that have been investigated largely in isolation. We show that these models…

Machine Learning · Statistics 2017-11-21 José Marcio Luna , Eric Eaton , Lyle H. Ungar , Eric Diffenderfer , Shane T. Jensen , Efstathios D. Gennatas , Mateo Wirth , Charles B. Simone , Timothy D. Solberg , Gilmer Valdes

Tree-Values: selective inference for regression trees

We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data…

Methodology · Statistics 2022-10-19 Anna C. Neufeld , Lucy L. Gao , Daniela M. Witten