Adaptive Split Balancing for Optimal Random Forest

2024-09-04 (v2). Subjects: Machine Learning; Statistics Theory; Methodology

Abstract

In this paper, we propose a new random forest algorithm that constructs trees using a novel adaptive split-balancing method. Rather than relying on the widely used random feature selection, we propose a permutation-based balanced splitting criterion. The adaptive split balancing forest (ASBF) achieves minimax optimality under the Lipschitz class. Its localized version, which fits local regressions at the leaf level, attains the minimax rate under the broad Hölder class $\mathcal{H}^{q,\beta}$ of problems for any $q\in\mathbb{N}$ and $\beta\in(0,1]$. We identify that over-reliance on auxiliary randomness in tree construction may compromise the approximation power of trees, leading to suboptimal results. Conversely, the proposed less random, permutation-based approach demonstrates optimality over a wide range of models. Although random forests are known to perform well empirically, their known theoretical convergence rates are slow. Simplified versions that construct trees without data dependence offer faster rates but lack adaptability during tree growth. Our proposed method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data. Additionally, we establish uniform upper bounds and demonstrate that ASBF improves dimensionality dependence in average treatment effect estimation problems. Simulation studies and real-world applications demonstrate our methods' superior performance over existing random forests.
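
To make the idea of permutation-based balanced splitting concrete, the following is a minimal illustrative sketch, not the authors' reference implementation. It assumes one plausible reading of the abstract: along every root-to-leaf path, the splitting features follow a random permutation, so each block of d consecutive splits uses every feature exactly once, while the split location is chosen from the data subject to each child keeping at least an `alpha` fraction of the parent's points. The function names (`grow_balanced_tree`, `predict`, `split_paths`), the parameter `alpha`, and the sum-of-squares criterion are all assumptions made for illustration. Subsampling, honesty, averaging over trees, and the leaf-level local regressions are omitted.

```python
import numpy as np

def grow_balanced_tree(X, y, depth, alpha=0.3, rng=None, order=None):
    """Grow one regression tree whose split features follow a permutation.

    Hypothetical sketch: along each root-to-leaf path, every block of d
    consecutive splits uses each of the d features exactly once.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # Assumed balance constraint: each child keeps >= alpha fraction of points.
    lo = max(1, int(np.ceil(alpha * n)))
    hi = min(n - 1, int(np.floor((1 - alpha) * n)))
    if depth == 0 or lo > hi:
        return {"leaf": True, "value": float(y.mean()), "n": n}
    if not order:
        # Start a new round with a fresh random permutation of the features.
        order = [int(j) for j in rng.permutation(d)]
    j, rest = order[0], order[1:]
    idx = np.argsort(X[:, j])
    ys = y[idx]

    # Data-adaptive split location: minimize the within-child sum of squares.
    def sse(k):
        return k * ys[:k].var() + (n - k) * ys[k:].var()

    k = min(range(lo, hi + 1), key=sse)
    thr = 0.5 * (X[idx[k - 1], j] + X[idx[k], j])
    left, right = idx[:k], idx[k:]
    return {
        "leaf": False, "feature": j, "threshold": float(thr), "n": n,
        "left": grow_balanced_tree(X[left], y[left], depth - 1, alpha, rng, rest),
        "right": grow_balanced_tree(X[right], y[right], depth - 1, alpha, rng, rest),
    }

def predict(tree, x):
    """Route a single point x to its leaf and return the leaf mean."""
    while not tree["leaf"]:
        go_left = x[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]

def split_paths(tree, prefix=()):
    """List the sequence of split features along every root-to-leaf path."""
    if tree["leaf"]:
        return [prefix]
    p = prefix + (tree["feature"],)
    return split_paths(tree["left"], p) + split_paths(tree["right"], p)

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = X[:, 0] + 0.1 * rng.normal(size=200)
tree = grow_balanced_tree(X, y, depth=6, rng=rng)
```

In this sketch the only auxiliary randomness is the order in which features are visited. No feature can be skipped for long, in contrast to random feature selection, where a relevant feature may by chance be left out of many consecutive splits.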

Cite

@article{arxiv.2402.11228,
  title  = {Adaptive Split Balancing for Optimal Random Forest},
  author = {Yuqian Zhang and Weijie Ji and Jelena Bradic},
  journal= {arXiv preprint arXiv:2402.11228},
  year   = {2024}
}