Linear Aggregation in Tree-based Estimators

Methodology 2021-09-13 v6 Applications Computation

Abstract

Regression trees and their ensembles are popular tools for nonparametric regression: they combine strong predictive performance with interpretable estimators. To improve their utility for locally smooth response surfaces, we study regression trees and random forests with linear aggregation functions. We introduce a new algorithm that finds the best axis-aligned split when linear aggregation functions are fit on the resulting nodes, and we offer a quasilinear time implementation. We demonstrate the algorithm's favorable performance on real-world benchmarks and in an extensive simulation study, and we illustrate its improved interpretability using a large get-out-the-vote experiment. We provide an open-source software package that implements several tree-based estimators with linear aggregation functions.
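
As a rough illustration of the split search described in the abstract, the sketch below scans every axis-aligned split of a node, fits a ridge-regularized linear model in each candidate child, and keeps the split with the smallest total residual sum of squares. It is a brute-force version that refits both children from scratch at every candidate, so it is not the quasilinear implementation the abstract refers to. The function names `ridge_sse` and `best_linear_split`, the ridge penalty `lam`, the unpenalized intercept, and the `min_leaf` constraint are all assumptions made for this example, not details taken from the paper or its software package.

```python
import numpy as np


def ridge_sse(X, y, lam):
    """Residual sum of squares of a ridge fit with an unpenalized intercept."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    P = lam * np.eye(A.shape[1])
    P[-1, -1] = 0.0  # assumption: the intercept is not penalized
    beta = np.linalg.solve(A.T @ A + P, A.T @ y)
    resid = y - A @ beta
    return float(resid @ resid)


def best_linear_split(X, y, lam=1e-3, min_leaf=5):
    """Brute-force search for the axis-aligned split minimizing the summed
    ridge SSE of the two children. Returns (feature, threshold, sse)."""
    n, d = X.shape
    best = (None, None, np.inf)
    for j in range(d):
        order = np.argsort(X[:, j])
        xs = X[order, j]
        for i in range(min_leaf, n - min_leaf + 1):
            if xs[i - 1] == xs[i]:
                continue  # cannot split between tied feature values
            left, right = order[:i], order[i:]
            sse = (ridge_sse(X[left], y[left], lam)
                   + ridge_sse(X[right], y[right], lam))
            if sse < best[2]:
                threshold = 0.5 * (xs[i - 1] + xs[i])
                best = (j, threshold, sse)
    return best


# Synthetic piecewise-linear response with a break at x0 = 0.5;
# the second feature is irrelevant by construction.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.where(X[:, 0] < 0.5, 2.0 * X[:, 0], 5.0 - 3.0 * X[:, 0])

feature, threshold, sse = best_linear_split(X, y)
print(feature, threshold, sse)
```

On this synthetic data the search should pick the first feature with a threshold close to the break, because only that split leaves each child with a single linear regime. Each candidate here costs two fresh linear solves; a faster implementation would instead update the fitted quantities incrementally as observations move from one child to the other.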

Cite

@article{arxiv.1906.06463,
  title  = {Linear Aggregation in Tree-based Estimators},
  author = {Sören R. Künzel and Theo F. Saarinen and Edward W. Liu and Jasjeet S. Sekhon},
  journal = {arXiv preprint arXiv:1906.06463},
  year   = {2021}
}