Trees-Based Models for Correlated Data

Methodology 2021-08-09 v2 Machine Learning

Abstract

This paper presents a new approach for trees-based regression methods, such as simple regression trees, random forests, and gradient boosting, in settings involving correlated data. We show the problems that arise when implementing standard trees-based regression models, which ignore the correlation structure. Our approach explicitly takes the correlation structure into account in the splitting criterion, the stopping rules, and the fitted values in the leaves, which requires major modifications of the standard methodology. Simulation experiments and real data analyses support the superiority of our approach over trees-based models that do not account for the correlation.

Cite

@article{arxiv.2102.08114,
  title  = {Trees-Based Models for Correlated Data},
  author = {Assaf Rabinowicz and Saharon Rosset},
  journal= {arXiv preprint arXiv:2102.08114},
  year   = {2021}
}