Covariate powered cross-weighted multiple testing

Methodology 2021-09-01 v7

Abstract

A fundamental task in the analysis of datasets with many variables is screening for associations. This can be cast as a multiple testing task, where the objective is achieving high detection power while controlling type I error. We consider $m$ hypothesis tests represented by pairs $((P_i, X_i))_{1\leq i \leq m}$ of p-values $P_i$ and covariates $X_i$, such that $P_i \perp X_i$ if $H_i$ is null. Here, we show how to use information potentially available in the covariates about heterogeneities among hypotheses to increase power compared to conventional procedures that only use the $P_i$. To this end, we upgrade existing weighted multiple testing procedures through the Independent Hypothesis Weighting (IHW) framework to use data-driven weights that are calculated as a function of the covariates. Finite sample guarantees, e.g., false discovery rate (FDR) control, are derived from cross-weighting, a data-splitting approach that enables learning the weight-covariate function without overfitting as long as the hypotheses can be partitioned into independent folds, with arbitrary within-fold dependence. IHW has increased power compared to methods that do not use covariate information. A key implication of IHW is that hypothesis rejection in common multiple testing setups should not proceed according to the ranking of the p-values, but by an alternative ranking implied by the covariate-weighted p-values.
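As a rough illustration of the cross-weighting idea, the Python sketch below learns the weights for each fold only from the p-values in the other folds, then applies a weighted Benjamini-Hochberg step to the ratios $P_i / W_i$. It is a simplified stand-in and not the IHW procedure itself: the names `weighted_bh` and `cross_weights`, the pre-binned covariate `strata`, the cutoff `tau`, and the "share of small held-out p-values" weighting rule are all assumptions made for this example, whereas IHW chooses its weights by optimization.

```python
def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: run the BH step-up rule on p_i / w_i."""
    m = len(pvalues)
    # A hypothesis with weight zero can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    k = 0  # largest rank whose weighted p-value passes the BH threshold
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def cross_weights(pvalues, strata, folds, tau=0.1):
    """Hypothetical weighting rule: one weight per (fold, stratum),
    learned only from p-values that lie OUTSIDE the fold."""
    m = len(pvalues)
    weights = [1.0] * m
    for fold in set(folds):
        # Score each covariate stratum by its share of small held-out p-values.
        scores = {}
        for s in set(strata):
            held_out = [pvalues[i] for i in range(m)
                        if folds[i] != fold and strata[i] == s]
            scores[s] = (sum(p <= tau for p in held_out) / len(held_out)
                         if held_out else 0.0)
        idx = [i for i in range(m) if folds[i] == fold]
        total = sum(scores[strata[i]] for i in idx)
        for i in idx:
            # Normalize so the weights average to 1 within the fold;
            # fall back to uniform weights if every score is zero.
            weights[i] = scores[strata[i]] * len(idx) / total if total > 0 else 1.0
    return weights


# Toy input: stratum 0 holds small p-values, stratum 1 holds large ones.
pvalues = [0.001, 0.002, 0.6, 0.7, 0.003, 0.004, 0.8, 0.9]
strata = [0, 0, 1, 1, 0, 0, 1, 1]   # assumed: covariate already binned
folds = [0, 0, 0, 0, 1, 1, 1, 1]    # assumed: folds are independent

weights = cross_weights(pvalues, strata, folds)
print(weights)
print(weighted_bh(pvalues, weights, alpha=0.1))
```

In this toy input the stratum whose held-out p-values are small receives the larger weights, so the rejection order follows the weighted p-values rather than the raw ones. Because the weight of a hypothesis never depends on its own fold's p-values, the sketch mirrors the data-splitting structure that the finite-sample guarantees rely on, though it carries none of those guarantees itself.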

Cite

@article{arxiv.1701.05179,
  title  = {Covariate powered cross-weighted multiple testing},
  author = {Nikolaos Ignatiadis and Wolfgang Huber},
  journal= {arXiv preprint arXiv:1701.05179},
  year   = {2021}
}

Comments

Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2021