Gradient Boosted Feature Selection

Machine Learning, 2019-01-15, v1

Abstract

A feature selection algorithm should ideally satisfy four conditions: it should reliably extract relevant features, identify non-linear feature interactions, scale linearly with the number of features and dimensions, and allow the incorporation of known sparsity structure. In this work we propose a novel feature selection algorithm, Gradient Boosted Feature Selection (GBFS), which satisfies all four of these requirements. The algorithm is flexible, scalable, and surprisingly straightforward to implement, as it is based on a modification of Gradient Boosted Trees. We evaluate GBFS on several real-world data sets and show that it matches or outperforms other state-of-the-art feature selection algorithms, while also scaling to larger data sets and naturally allowing for domain-specific side information.
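To make the idea of selecting features through boosting concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the modification to Gradient Boosted Trees takes the form of a fixed cost charged the first time a feature is used in a split, which the abstract does not spell out. The function names (`best_stump`, `boosted_feature_selection`), the use of depth-one trees with squared loss, the synthetic data, and the value `penalty=50.0` are all illustrative assumptions.

```python
import random


def best_stump(X, residuals, used, penalty):
    """Find the single split minimising squared error plus a feature cost.

    Returns (score, feature, threshold, left_value, right_value).
    """
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best = None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        # Assumed cost: charged only if feature j has not been used before.
        cost = 0.0 if j in used else penalty
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Squared error after the split equals a constant minus this gain.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            score = cost - gain
            if best is None or score < best[0]:
                best = (score, j, 0.5 * (lo + hi),
                        left_sum / k, right_sum / (n - k))
    return best


def boosted_feature_selection(X, y, n_rounds=20, learning_rate=0.5, penalty=50.0):
    """Boost stumps on squared loss and return the set of features used."""
    n = len(X)
    pred = [0.0] * n
    used = set()
    for _ in range(n_rounds):
        # For squared loss the negative gradient is the plain residual.
        residuals = [y[i] - pred[i] for i in range(n)]
        _, j, thr, left_val, right_val = best_stump(X, residuals, used, penalty)
        used.add(j)
        for i in range(n):
            step = left_val if X[i][j] <= thr else right_val
            pred[i] += learning_rate * step
    return used


# Synthetic data: six features, only features 0 and 1 influence the target.
rng = random.Random(0)
X = [[rng.random() for _ in range(6)] for _ in range(200)]
y = [3.0 * (row[0] > 0.5) + 2.0 * (row[1] > 0.5) for row in X]

selected = boosted_feature_selection(X, y)
print(sorted(selected))
```

In this sketch a feature is only added when the reduction in squared error it offers exceeds the cost of opening it, so features already in use are reused for free and uninformative ones tend to stay out. Replacing the uniform `penalty` with a per-feature cost would be one way to encode side information about which features are preferred.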

Cite

@article{arxiv.1901.04055,
  title   = {Gradient Boosted Feature Selection},
  author  = {Zhixiang Eddie Xu and Gao Huang and Kilian Q. Weinberger and Alice X. Zheng},
  journal = {arXiv preprint arXiv:1901.04055},
  year    = {2019}
}