Related papers: Adaptive Split Balancing for Optimal Random Forest

Stochastic Optimization Forests

We study contextual stochastic optimization problems, where we leverage rich auxiliary observations (e.g., product characteristics) to improve decision making with uncertain variables (e.g., demand). We show how to train forest decision…

Optimization and Control · Mathematics 2022-03-17 Nathan Kallus, Xiaojie Mao

Minimax Rates for High-Dimensional Random Tessellation Forests

Random forests are a popular class of algorithms used for regression and classification. The algorithm introduced by Breiman in 2001 and many of its variants are ensembles of randomized decision trees built from axis-aligned partitions of…

Statistics Theory · Mathematics 2023-10-31 Eliza O'Reilly, Ngoc Mai Tran

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary classification problems. The terminology "best-scored" means to select the one with the best empirical performance out of a certain number of purely random tree candidates…

Machine Learning · Statistics 2019-05-28 Hanyuan Hang, Xiaoyu Liu, Ingo Steinwart

Collaborative Training of Balanced Random Forests for Open Set Domain Adaptation

In this paper, we introduce a collaborative training algorithm of balanced random forests with convolutional neural networks for domain adaptation tasks. In real scenarios, most domain adaptation algorithms face the challenges from noisy,…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Jongbin Ryu, Jiun Bae, Jongwoo Lim

Functional Random Forest with Adaptive Cost-Sensitive Splitting for Imbalanced Functional Data Classification

Classification of functional data where observations are curves or trajectories poses unique challenges, particularly under severe class imbalance. Traditional Random Forest algorithms, while robust for tabular data, often fail to capture…

Machine Learning · Statistics 2025-12-10 Fahad Mostafa, Hafiz Khan

Adaptive Forests For Classification

Random Forests (RF) and Extreme Gradient Boosting (XGBoost) are two of the most widely used and highly performing classification and regression models. They aggregate equally weighted CART trees, generated randomly in RF or sequentially in…

Machine Learning · Computer Science 2025-10-28 Dimitris Bertsimas, Yubing Cui

Distributional Adaptive Soft Regression Trees

Random forests are an ensemble method relevant for many problems, such as regression or classification. They are popular due to their good predictive performance (compared to, e.g., decision trees) requiring only minimal tuning of…

Methodology · Statistics 2022-10-20 Nikolaus Umlauf, Nadja Klein

A Random Forest Approach for Modeling Bounded Outcomes

Random forests have become an established tool for classification and regression, in particular in high-dimensional settings and in the presence of complex predictor-response relationships. For bounded outcome variables restricted to the…

Methodology · Statistics 2019-01-21 Leonie Weinhold, Matthias Schmid, Marvin N. Wright, Moritz Berger

Infinite random forests for imbalanced classification tasks

We study predictive probability inference in classification tasks using random forests under class imbalance. We focus on two simplified variants of Breiman's algorithm, namely subsampling Infinite Random Forests (IRFs) and under-sampling…

Statistics Theory · Mathematics 2025-05-23 Moria Mayala, Olivier Wintenberger, Charles Tillier, Clément Dombry

SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules

Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive…

Machine Learning · Statistics 2025-01-20 Stamatina Lamprinakou, Huiyan Sang, Bledar A. Konomi, Ligang Lu

Minimax optimal rates for Mondrian trees and forests

Introduced by Breiman, Random Forests are widely used classification and regression algorithms. While being initially designed as batch algorithms, several variants have been proposed to handle online learning. One particular instance of…

Machine Learning · Statistics 2019-04-10 Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

Asymptotic Properties of High-Dimensional Random Forests

As a flexible nonparametric learning tool, the random forests algorithm has been widely applied to various real applications with appealing empirical performance, even in the presence of high-dimensional feature space. Unveiling the…

Statistics Theory · Mathematics 2022-09-27 Chien-Ming Chi, Patrick Vossler, Yingying Fan, Jinchi Lv

A novel gradient-based method for decision trees optimizing arbitrary differentiable loss functions

There are many approaches for training decision trees. This work introduces a novel gradient-based method for constructing decision trees that optimize arbitrary differentiable loss functions, overcoming the limitations of heuristic…

Machine Learning · Computer Science 2025-03-25 Andrei V. Konstantinov, Lev V. Utkin

CO2 Forest: Improved Random Forest by Continuous Optimization of Oblique Splits

We propose a novel algorithm for optimizing multivariate linear threshold functions as split functions of decision trees to create improved Random Forest classifiers. Standard tree induction methods resort to sampling and exhaustive search…

Machine Learning · Computer Science 2015-06-26 Mohammad Norouzi, Maxwell D. Collins, David J. Fleet, Pushmeet Kohli

Alpha-Trimming: Locally Adaptive Tree Pruning for Random Forests

We demonstrate that adaptively controlling the size of individual regression trees in a random forest can improve predictive performance, contrary to the conventional wisdom that trees should be fully grown. A fast pruning algorithm,…

Machine Learning · Statistics 2024-08-15 Nikola Surjanovic, Andrew Henrey, Thomas M. Loughin

Heterogeneous Random Forest

Random forest (RF) stands out as a highly favored machine learning approach for classification problems. The effectiveness of RF hinges on two key factors: the accuracy of individual trees and the diversity among them. In this study, we…

Machine Learning · Computer Science 2024-10-28 Ye-eun Kim, Seoung Yun Kim, Hyunjoong Kim

Classification algorithms using adaptive partitioning

Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set…

Statistics Theory · Mathematics 2014-11-05 Peter Binev, Albert Cohen, Wolfgang Dahmen, Ronald DeVore

Resource-aware Elastic Swap Random Forest for Evolving Data Streams

Continual learning based on data stream mining deals with ubiquitous sources of Big Data arriving at high-velocity and in real-time. Adaptive Random Forest (ARF) is a popular ensemble method used for continual learning due to its…

Machine Learning · Computer Science 2019-05-16 Diego Marrón, Eduard Ayguadé, José Ramon Herrero, Albert Bifet

Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests

Phylogenetic trees are leaf-labelled trees used to model the evolution of species. In practice it is not uncommon to obtain two topologically distinct trees for the same set of species, and this motivates the use of distance measures to…

Data Structures and Algorithms · Computer Science 2026-03-24 David Mestel, Steven Chaplick, Steven Kelk, Ruben Meuwese

Adaptive Concentration of Regression Trees, with Application to Random Forests

We study the convergence of the predictive surface of regression trees and forests. To support our analysis we introduce a notion of adaptive concentration for regression trees. This approach breaks tree training into a model selection…

Statistics Theory · Mathematics 2016-05-03 Stefan Wager, Guenther Walther