Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha

Methodology 2024-04-04 v3

Abstract

A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation (e.g. $p \ll \alpha$) for getting better frequentist decisions. With e-values it is straightforward, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post-hoc, after observation of the data -- thereby providing a handle on `roving $\alpha$'s'. When Type-II risks are taken into consideration, the only admissible decision rules in the post-hoc setting turn out to be e-value-based. Similarly, if the loss incurred when specifying a faulty confidence interval is not fixed in advance, standard confidence intervals and distributions may fail whereas e-confidence sets and e-posteriors still provide valid risk guarantees. Sufficiently powerful e-values have by now been developed for a range of classical testing problems. We discuss the main challenges for wider development and deployment.
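For readers new to e-values, the following is a minimal sketch of the textbook idea behind them, not the paper's own construction. It assumes the standard definition: an e-value is a nonnegative statistic whose expectation under the null is at most 1, so by Markov's inequality the rule "reject when E >= 1/alpha" has Type-I error at most alpha. The Gaussian setup, the function names, and the parameters `mu1`, `n`, `trials`, and `seed` are all illustrative assumptions.

```python
import math
import random


def lr_e_value(xs, mu1):
    """Likelihood-ratio e-value for H0: N(0, 1) against the alternative N(mu1, 1).

    Under H0 each factor exp(mu1 * x - mu1**2 / 2) has expectation 1,
    so the product over independent observations also has expectation 1.
    """
    log_e = sum(mu1 * x - 0.5 * mu1 ** 2 for x in xs)
    return math.exp(log_e)


def rejects(e_value, alpha):
    """Reject H0 when the e-value reaches the threshold 1 / alpha."""
    return e_value >= 1.0 / alpha


def null_rejection_rate(alpha, mu1=0.5, n=10, trials=2000, seed=0):
    """Monte Carlo estimate of the Type-I error rate when H0 is true.

    All default values here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # data drawn under H0
        hits += rejects(lr_e_value(xs, mu1), alpha)
    return hits / trials
```

Calling `null_rejection_rate(0.05)` estimates how often the rule rejects a true null; the Markov bound implies the true rate cannot exceed 0.05. The sketch covers only this fixed-alpha guarantee; the post-hoc, loss-dependent guarantees described in the abstract are developed in the paper itself.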

Cite

@article{arxiv.2205.00901,
  title  = {Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha},
  author = {Peter Grünwald},
  journal= {arXiv preprint arXiv:2205.00901},
  year   = {2024}
}

Comments

Third, once again thoroughly revised version. Part of the material in the first version has moved to another paper, "The E-Posterior", to appear in Phil. Trans. Royal Soc. of London Series A. Compared to the second version, the technical treatment in this version has been considerably simplified.