Continuous Testing: Unifying Tests and E-values
Abstract
The e-value is swiftly rising in prominence in many applications of hypothesis testing and multiple testing, yet its relationship to classical testing theory remains elusive. We unify e-values and classical testing into a single 'continuous testing' framework: we argue that e-values are simply the continuous generalization of a test. This cements their foundational role in hypothesis testing. Such continuous tests relate to the rejection probability of classical randomized tests, offering the benefits of randomized tests without the downsides of a randomized decision. By generalizing the traditional notion of power, we obtain a unified theory of optimal continuous testing that nests both classical Neyman-Pearson-optimal tests and log-optimal e-values as special cases. This implies that the only difference between typical classical tests and typical e-values is the choice of power target. We visually illustrate this in a Gaussian location model, where such tests are easy to express. Finally, we describe the relationship to the traditional p-value, and show that continuous tests offer a stronger and arguably more appropriate guarantee than p-values when used as a continuous measure of evidence.
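As a rough illustration of the two special cases named in the abstract, the sketch below compares two e-values in a one-sided Gaussian location problem, N(0, 1) against N(mu, 1). It is not taken from the paper: the level `ALPHA`, the alternative mean `MU`, and all function names are assumptions chosen for the example. It uses the standard facts that the likelihood ratio is the log-optimal e-value for a simple null against a simple alternative, and that a level-alpha binary test rescaled by 1/alpha has expectation at most 1 under the null.

```python
import math
from statistics import NormalDist

ALPHA = 0.05  # illustrative significance level (assumption)
MU = 1.0      # illustrative alternative mean (assumption)


def lr_e_value(x, mu=MU):
    """Likelihood ratio of N(mu, 1) against N(0, 1) at observation x."""
    return math.exp(mu * x - 0.5 * mu * mu)


def np_e_value(x, alpha=ALPHA):
    """One-sided level-alpha test, rescaled to take values in {0, 1/alpha}."""
    critical = NormalDist().inv_cdf(1.0 - alpha)
    return (1.0 / alpha) if x > critical else 0.0


def null_mean(e, lo=-12.0, hi=12.0, n=20_000):
    """Midpoint-rule approximation of E[e(X)] for X ~ N(0, 1)."""
    h = (hi - lo) / n
    pdf = NormalDist().pdf
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += e(x) * pdf(x)
    return total * h


if __name__ == "__main__":
    # Both null expectations should be close to 1, up to numerical error.
    print("likelihood ratio:", null_mean(lr_e_value))
    print("rescaled binary test:", null_mean(np_e_value))
```

Both functions integrate to roughly 1 under the null, so both are valid e-values in this toy setting. The rescaled binary test only ever reports 0 or 1/alpha, while the likelihood ratio varies continuously with the observation.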
Cite
@article{arxiv.2409.05654,
title = {Continuous Testing: Unifying Tests and E-values},
author = {Nick W. Koning},
journal = {arXiv preprint arXiv:2409.05654},
year = {2025}
}
Comments
Somehow re-uploaded the old version last week (?!). New abstract + refined expected-utility-optimal testing results, beyond bounded utilities