Local Linear Forests

Primary subject: Machine Learning. Date: 2020-09-08 (v4). Subjects: Machine Learning; Econometrics; Statistics Theory

Abstract

Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals, and can show poor predictive performance in the presence of strong, smooth effects. Taking the perspective of random forests as an adaptive kernel method, we pair the forest kernel with a local linear regression adjustment to better capture smoothness. The resulting procedure, local linear forests, enables us to improve on asymptotic rates of convergence for random forests with smooth signals, and provides substantial gains in accuracy on both real and simulated data. We prove a central limit theorem valid under regularity conditions on the forest and smoothness constraints, and propose a computationally efficient construction for confidence intervals. Moving to a causal inference application, we discuss the merits of local regression adjustments for heterogeneous treatment effect estimation, and give an example on a dataset exploring the effect word choice has on attitudes to the social safety net. Last, we include simulation results on real and generated data.
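To make the central idea concrete, here is a minimal one-dimensional sketch of pairing a forest-style kernel with a local linear adjustment. It is an illustration under stated assumptions, not the authors' implementation: the names `forest_weights` and `local_linear_predict` are hypothetical, each "tree" is reduced to a hand-picked list of split points rather than a fitted regression tree, and the ridge penalty is applied to the slope only.

```python
from bisect import bisect_right


def forest_weights(x0, xs, trees):
    """Forest-style kernel weights at x0 (hypothetical helper).

    Each 'tree' is a sorted list of split points standing in for a fitted
    regression tree. A training point gets weight 1 / (leaf size) from every
    tree in which it shares a leaf with x0; weights are averaged over trees.
    """
    weights = [0.0] * len(xs)
    for splits in trees:
        leaf0 = bisect_right(splits, x0)
        members = [i for i, x in enumerate(xs) if bisect_right(splits, x) == leaf0]
        for i in members:
            weights[i] += 1.0 / (len(members) * len(trees))
    return weights


def local_linear_predict(x0, xs, ys, weights, ridge=0.0):
    """Weighted least squares of y on (1, x - x0); returns the intercept.

    The ridge penalty is placed on the slope only (an assumption of this
    sketch), so the intercept, which is the prediction at x0, is unpenalized.
    """
    sw = sum(weights)
    if sw == 0:
        raise ValueError("all weights are zero at x0")
    s1 = sum(w * (x - x0) for w, x in zip(weights, xs))
    s2 = sum(w * (x - x0) ** 2 for w, x in zip(weights, xs))
    t0 = sum(w * y for w, y in zip(weights, ys))
    t1 = sum(w * (x - x0) * y for w, x, y in zip(weights, xs, ys))
    # Normal equations: [[sw, s1], [s1, s2 + ridge]] @ [mu, theta] = [t0, t1]
    det = sw * (s2 + ridge) - s1 * s1
    if det <= 1e-12:
        return t0 / sw  # degenerate design: fall back to the weighted mean
    return (t0 * (s2 + ridge) - s1 * t1) / det


# Toy data with an exactly linear signal, evaluated at the left boundary.
xs = [i / 10 for i in range(11)]
ys = [2 * x + 1 for x in xs]
trees = [[0.25, 0.5, 0.75], [0.3, 0.6], [0.45]]  # made-up split points

w = forest_weights(0.0, xs, trees)
naive = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)  # plain forest-style average
pred = local_linear_predict(0.0, xs, ys, w)             # local linear adjustment
print(naive, pred)
```

On this toy example the plain weighted average is pulled upward at the boundary, because every neighbor of `x0 = 0` lies to its right, while the local linear fit recovers the linear signal at `x0`. Setting `ridge` above zero shrinks the slope toward zero and moves the prediction back toward the weighted average.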

Cite

@article{arxiv.1807.11408,
  title   = {Local Linear Forests},
  author  = {Rina Friedberg and Julie Tibshirani and Susan Athey and Stefan Wager},
  journal = {arXiv preprint arXiv:1807.11408},
  year    = {2020}
}

Comments

Forthcoming in the Journal of Computational and Graphical Statistics