Identifying important predictors in large data bases -- multiple testing and model selection

Malgorzata Bogdan; Florian Frommlet

Methodology 2020-11-25 v1

Authors: Malgorzata Bogdan, Florian Frommlet

Abstract

This is a chapter of the forthcoming Handbook of Multiple Testing. We consider a variety of model selection strategies in a high-dimensional setting, where the number of potential predictors p is large compared to the number of available observations n. Modifications of information criteria that are suitable when p > n are introduced and compared with a variety of penalized likelihood methods, in particular SLOPE and SLOBE. The focus is on methods that control the false discovery rate (FDR) in terms of model identification. Theoretical results are provided with respect to both model identification and prediction, and simulation results illustrate the performance of the different methods in different situations.

Keywords

false discovery rate control; machine learning; statistical inference and model selection

Cite

@article{arxiv.2011.12154,
  title  = {Identifying important predictors in large data bases -- multiple testing and model selection},
  author = {Malgorzata Bogdan and Florian Frommlet},
  journal= {arXiv preprint arXiv:2011.12154},
  year   = {2020}
}
