Related papers: Big Data Regression Using Tree Based Segmentation

Big Data Classification Using Augmented Decision Trees

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan , Sourish Das

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

Gradient boosting machine with partially randomized decision trees

The gradient boosting machine is a powerful ensemble-based machine learning method for solving regression problems. However, one of the difficulties of its using is a possible discontinuity of the regression function, which arises when…

Machine Learning · Computer Science 2020-06-22 Andrei V. Konstantinov , Lev V. Utkin

Measure Inducing Classification and Regression Trees for Functional Data

We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which allows to leverage representation learning and multiple splitting rules at the node level, reducing…

Machine Learning · Statistics 2020-11-03 Edoardo Belli , Simone Vantini

Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees

Additive models, such as produced by gradient boosting, and full interaction models, such as classification and regression trees (CART), are widely used algorithms that have been investigated largely in isolation. We show that these models…

Machine Learning · Statistics 2017-11-21 José Marcio Luna , Eric Eaton , Lyle H. Ungar , Eric Diffenderfer , Shane T. Jensen , Efstathios D. Gennatas , Mateo Wirth , Charles B. Simone , Timothy D. Solberg , Gilmer Valdes

Inference with Randomized Regression Trees

Regression trees are a popular machine learning algorithm that fit piecewise constant models by recursively partitioning the predictor space. This paper focuses on statistical inference for a data-dependent model obtained from a fitted…

Methodology · Statistics 2025-12-17 Soham Bakshi , Yiling Huang , Snigdha Panigrahi , Walter Dempsey

Dive into Decision Trees and Forests: A Theoretical Demonstration

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

Condensed Gradient Boosting

This paper presents a computationally efficient variant of gradient boosting for multi-class classification and multi-output regression tasks. Standard gradient boosting uses a 1-vs-all strategy for classifications tasks with more than two…

Machine Learning · Computer Science 2024-07-25 Seyedsaman Emami , Gonzalo Martínez-Muñoz

Gradient tree boosting with random output projections for multi-label classification and multi-output regression

In many applications of supervised learning, multiple classification or regression outputs have to be predicted jointly. We consider several extensions of gradient boosting to address such problems. We first propose a straightforward…

Machine Learning · Statistics 2019-05-21 Arnaud Joly , Louis Wehenkel , Pierre Geurts

Two-stage Best-scored Random Forest for Large-scale Regression

We propose a novel method designed for large-scale regression problems, namely the two-stage best-scored random forest (TBRF). "Best-scored" means to select one regression tree with the best empirical performance out of a certain number of…

Machine Learning · Statistics 2019-05-10 Hanyuan Hang , Yingyi Chen , Johan A. K. Suykens

Adaptive Scaling

Preprocessing data is an important step before any data analysis. In this paper, we focus on one particular aspect, namely scaling or normalization. We analyze various scaling methods in common use and study their effects on different…

Machine Learning · Statistics 2017-09-05 Ting Li , Bingyi Jing , Ningchen Ying , Xianshi Yu

Memory-Efficient Global Refinement of Decision-Tree Ensembles and its Application to Face Alignment

Ren et al. recently introduced a method for aggregating multiple decision trees into a strong predictor by interpreting a path taken by a sample down each tree as a binary vector and performing linear regression on top of these vectors…

Computer Vision and Pattern Recognition · Computer Science 2018-09-05 Nenad Markuš , Ivan Gogić , Igor S. Pandžić , Jörgen Ahlberg

Regression Trees Know Calculus

Regression trees have emerged as a preeminent tool for solving real-world regression problems due to their ability to deal with nonlinearities, interaction effects and sharp discontinuities. In this article, we rather study regression trees…

Machine Learning · Statistics 2025-11-14 Nathan Wycoff

Regression trees for longitudinal and multiresponse data

Previous algorithms for constructing regression tree models for longitudinal and multiresponse data have mostly followed the CART approach. Consequently, they inherit the same selection biases and computational difficulties as CART. We…

Machine Learning · Statistics 2013-05-28 Wei-Yin Loh , Wei Zheng

Sparse learning with CART

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogenous daughter nodes according to…

Machine Learning · Statistics 2020-11-20 Jason M. Klusowski

Variable Grouping Based Bayesian Additive Regression Tree

Using ensemble methods for regression has been a large success in obtaining high-accuracy prediction. Examples are Bagging, Random forest, Boosting, BART (Bayesian additive regression tree), and their variants. In this paper, we propose a…

Machine Learning · Computer Science 2019-11-06 Yuhao Su , Jie Ding

A Generalized Stacking for Implementing Ensembles of Gradient Boosting Machines

The gradient boosting machine is one of the powerful tools for solving regression problems. In order to cope with its shortcomings, an approach for constructing ensembles of gradient boosting models is proposed. The main idea behind the…

Machine Learning · Computer Science 2020-10-14 Andrei V. Konstantinov , Lev V. Utkin

Improving the precision of classification trees

Besides serving as prediction models, classification trees are useful for finding important predictor variables and identifying interesting subgroups in the data. These functions can be compromised by weak split selection algorithms that…

Applications · Statistics 2010-11-03 Wei-Yin Loh

Fast Gaussian Process Regression for Big Data

Gaussian Processes are widely used for regression tasks. A known limitation in the application of Gaussian Processes to regression tasks is that the computation of the solution requires performing a matrix inversion. The solution also…

Machine Learning · Computer Science 2017-08-22 Sourish Das , Sasanka Roy , Rajiv Sambasivan

Robust Boosting for Regression Problems

Gradient boosting algorithms construct a regression predictor using a linear combination of ``base learners''. Boosting also offers an approach to obtaining robust non-parametric regression estimators that are scalable to applications with…

Methodology · Statistics 2020-08-11 Xiaomeng Ju , Matías Salibián-Barrera