Related papers: Improved thresholds for e-values

Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha

A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation…

Methodology · Statistics 2024-04-04 Peter Grünwald

A Decision Theoretic Approach to A/B Testing

A/B testing is ubiquitous within the machine learning and data science operations of internet companies. Generically, the idea is to perform a statistical test of the hypothesis that a new feature is better than the existing platform---for…

Statistics Theory · Mathematics 2017-10-11 David Goldberg, James E. Johndrow

E-Values, Bayes Risk, Dual Role of Markov's Inequality

Two approaches to hypothesis testing, e-value testing and Bayes risk minimisation, both invoke Markov's inequality to control error probabilities. They differ in which distribution certifies the unit-moment condition: the null for Type I…

Statistics Theory · Mathematics 2026-04-02 Nicholas G. Polson, Daniel Zantedeschi

Post-hoc $\alpha$ Hypothesis Testing and the Post-hoc $p$-value

In traditional hypothesis testing one must pre-specify the significance level $\alpha$ to bound the 'size' of the test: its probability to falsely reject the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…

Statistics Theory · Mathematics 2025-12-03 Nick W. Koning

Threshold Selection in Univariate Extreme Value Analysis

Threshold selection plays a key role for various aspects of statistical inference of rare events. Most classical approaches tackling this problem for heavy-tailed distributions crucially depend on tuning parameters or critical values to be…

Methodology · Statistics 2019-03-07 Laura Fee Schneider, Andrea Krajina, Tatyana Krivobokova

Adaptive Sign Error Control

In multiple testing scenarios, typically the sign of a parameter is inferred when its estimate exceeds some significance threshold in absolute value. Typically, the significance threshold is chosen to control the experimentwise type I error…

Methodology · Statistics 2018-01-03 Chaoyu Yu, Peter D. Hoff

The 'Right' Extension of Type-I Error to Data-Dependent Levels

The literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily…

Statistics Theory · Mathematics 2026-05-28 Nick W. Koning

Threshold rules for online sample selection

We consider the following sample selection problem. We observe in an online fashion a sequence of samples, each endowed by a quality. Our goal is to either select or reject each sample, so as to maximize the aggregate quality of the…

Data Structures and Algorithms · Computer Science 2010-07-20 Eric Bach, Shuchi Chawla, Seeun Umboh

Multiple Testing in Generalized Universal Inference

Compared to p-values, e-values provably guarantee safe, valid inference. If the goal is to test multiple hypotheses simultaneously, one can construct e-values for each individual test and then use the recently developed e-BH procedure to…

Methodology · Statistics 2024-12-03 Neil Dey, Ryan Martin, Jonathan P. Williams

Tiny but uniform improvements of adaptive BH procedures via compound e-values

After the seminal Benjamini-Hochberg (BH) procedure for controlling the false discovery rate (FDR) was proposed, dozens of papers have attempted to improve its power by adapting to the unknown proportion of nulls. We observe that most null…

Methodology · Statistics 2026-03-24 Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas

False discovery rate control with e-values

E-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. In brief, e-values are realized by random variables with expectation at most one under the null; examples include…

Statistics Theory · Mathematics 2021-12-16 Ruodu Wang, Aaditya Ramdas

Carefree multiple testing with e-processes

E-processes enable hypothesis testing with ongoing data collection while maintaining Type I error control. However, when testing multiple hypotheses simultaneously, current $e$-value based multiple testing methods such as e-BH are not…

Statistics Theory · Mathematics 2025-07-18 Yury Tavyrikov, Jelle J. Goeman, Rianne de Heide

Confidence and discoveries with e-values

We discuss systematically two versions of confidence regions: those based on p-values and those based on e-values, a recent alternative to p-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are…

Statistics Theory · Mathematics 2024-03-05 Vladimir Vovk, Ruodu Wang

Optimal e-values for testing the mean of a bounded random variable against a composite alternative

We derive the unique e-values with optimal (relative) growth rate in the worst case for testing the mean of a bounded random variable, hereby contributing with the first application beyond the assumption of mutually absolutely continuous…

Statistics Theory · Mathematics 2026-01-19 Sebastian Arnold, Eugenio Clerico

Choosing the nominal level post-hoc with knockoffs using e-values

The knockoff filter is a powerful tool for controlled variable selection with false discovery rate (FDR) control. In this paper, we leverage e-values to allow the nominal FDR level to be switched post-hoc, after looking at the data and…

Methodology · Statistics 2026-02-20 Lasse Fischer, Konstantinos Sechidis

Adaptive clinical trials based on design-optimal e-values with automatic curtailment: An application to single-arm trials with binary data

The e-value is gaining traction as a robust alternative to p-values and Bayes factors for quantifying statistical evidence. e-values are a promising method for adaptive clinical trials due to their anytime-validity: e-values ensure type I…

Methodology · Statistics 2026-05-28 Stef Baas, Judith ter Schure, Joost van Rosmalen

Expected loss analysis of thresholded authentication protocols in noisy conditions

A number of authentication protocols have been proposed recently, where at least some part of the authentication is performed during a phase, lasting $n$ rounds, with no error correction. This requires assigning an acceptable threshold for…

Cryptography and Security · Computer Science 2010-09-03 Christos Dimitrakakis, Aikaterini Mitrokotsa, Serge Vaudenay

Choosing alpha post hoc: the danger of multiple standard significance thresholds

A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…

Applications · Statistics 2025-03-11 Jesse Hemerik, Nick W Koning

Selection from Hierarchical Data with Conformal e-values

Distribution-free predictive inference beyond the construction of prediction sets has gained a lot of interest in recent applications. One such application is the selection task, where the objective is to design a reliable selection rule to…

Methodology · Statistics 2025-01-07 Yonghoon Lee, Zhimei Ren

Fault-Tolerant Thresholds for Encoded Ancillae with Homogeneous Errors

I describe a procedure for calculating thresholds for quantum computation as a function of error model given the availability of ancillae prepared in logical states with independent, identically distributed errors. The thresholds are…

Quantum Physics · Physics 2009-11-16 Bryan Eastin