Related papers: Sparse learning with CART

Analyzing CART

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For binary classification and regression models, this approach recursively divides the data into two near-homogenous…

Machine Learning · Statistics 2020-08-17 Jason M. Klusowski

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

Covariance-Driven Regression Trees: Reducing Overfitting in CART

Decision trees are powerful machine learning algorithms, widely used in fields such as economics and medicine for their simplicity and interpretability. However, decision trees such as CART are prone to overfitting, especially when grown…

Machine Learning · Statistics 2026-01-13 Likun Zhang , Wei Ma

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity…

Optimization and Control · Mathematics 2020-02-24 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

HHCART: An Oblique Decision Tree

Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic…

Machine Learning · Statistics 2015-04-15 D. C. Wickramarachchi , B. L. Robertson , M. Reale , C. J. Price , J. Brown

Penalized Split Criteria for Interpretable Trees

This paper describes techniques for growing classification and regression trees designed to induce visually interpretable trees. This is achieved by penalizing splits that extend the subset of features used in a particular branch of the…

Methodology · Statistics 2013-10-22 Alex Goldstein , Andreas Buja

Measure Inducing Classification and Regression Trees for Functional Data

We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which allows to leverage representation learning and multiple splitting rules at the node level, reducing…

Machine Learning · Statistics 2020-11-03 Edoardo Belli , Simone Vantini

Tree-Values: selective inference for regression trees

We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data…

Methodology · Statistics 2022-10-19 Anna C. Neufeld , Lucy L. Gao , Daniela M. Witten

Soft regression trees: a model variant and a decomposition training algorithm

Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized…

Machine Learning · Computer Science 2025-01-28 Antonio Consolo , Edoardo Amaldi , Andrea Manno

Large Scale Prediction with Decision Trees

This paper shows that decision trees constructed with Classification and Regression Trees (CART) and C4.5 methodology are consistent for regression and classification tasks, even when the number of predictor variables grows…

Machine Learning · Statistics 2023-11-15 Jason M. Klusowski , Peter M. Tian

Experiments with Optimal Model Trees

Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to ``classic'' decision trees with constant values in their leaves, model trees can use linear…

Machine Learning · Computer Science 2026-03-11 Sabino Francesco Roselli , Eibe Frank

A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds

Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well…

Machine Learning · Statistics 2021-10-20 Yan Shuo Tan , Abhineet Agarwal , Bin Yu

Big Data Regression Using Tree Based Segmentation

Scaling regression to large datasets is a common problem in many application areas. We propose a two step approach to scaling regression to large datasets. Using a regression tree (CART) to segment the large dataset constitutes the first…

Machine Learning · Statistics 2017-07-26 Rajiv Sambasivan , Sourish Das

Bayesian Regression Tree Ensembles that Adapt to Smoothness and Sparsity

Ensembles of decision trees are a useful tool for obtaining for obtaining flexible estimates of regression functions. Examples of these methods include gradient boosted decision trees, random forests, and Bayesian CART. Two potential…

Methodology · Statistics 2018-09-18 Antonio Ricardo Linero , Yun Yang

Stabilizing the Splits through Minimax Decision Trees

By revisiting the end-cut preference (ECP) phenomenon associated with a single CART (Breiman et al. (1984)), we introduce MinimaxSplit decision trees, a robust alternative to CART that selects splits by minimizing the worst-case child risk…

Statistics Theory · Mathematics 2026-04-16 Zhenyuan Zhang , Hengrui Luo

Near Optimal Decision Trees in a SPLIT Second

Decision tree optimization is fundamental to interpretable machine learning. The most popular approach is to greedily search for the best feature at every decision point, which is fast but provably suboptimal. Recent approaches find the…

Machine Learning · Computer Science 2025-11-19 Varun Babbar , Hayden McTavish , Cynthia Rudin , Margo Seltzer

Output-Constrained Decision Trees

Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained…

Machine Learning · Computer Science 2026-04-06 Hüseyin Tunç , Doğanay Özese , Ş. İlker Birbil , Donato Maragno , Marco Caserta , Mustafa Baydoğan

A Partitioning Deletion/Substitution/Addition Algorithm for Creating Survival Risk Groups

Accurately assessing a patient's risk of a given event is essential in making informed treatment decisions. One approach is to stratify patients into two or more distinct risk groups with respect to a specific outcome using both clinical…

Methodology · Statistics 2015-03-17 Karen Lostritto , Robert Strawderman , Annette Molinaro

The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation

Recursive decision trees are widely used to estimate heterogeneous causal treatment effects in experimental and observational studies. These methods are typically implemented using CART-type recursive partitioning and are often viewed as…

Statistics Theory · Mathematics 2026-03-19 Matias D. Cattaneo , Jason M. Klusowski , Ruiqi Rae Yu

Bayesian Additive Regression Trees with Model Trees

Bayesian Additive Regression Trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART assumes regularisation priors on a set of trees that work as weak learners…

Machine Learning · Statistics 2022-06-07 Estevão B. Prado , Rafael A. Moral , Andrew C. Parnell