Related papers: Beyond Neyman-Pearson: e-values enable hypothesis …

Post-hoc $\alpha$ Hypothesis Testing and the Post-hoc $p$-value

In traditional hypothesis testing one must pre-specify the significance level $\alpha$ to bound the 'size' of the test: its probability to falsely reject the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…

Statistics Theory · Mathematics 2025-12-03 Nick W. Koning

On admissibility in post-hoc hypothesis testing

The validity of classical hypothesis testing requires the significance level $\alpha$ be fixed before any statistical analysis takes place. This is a stringent requirement. For instance, it prohibits updating $\alpha$ during (or after) an…

Statistics Theory · Mathematics 2026-01-21 Ben Chugg, Tyron Lardy, Aaditya Ramdas, Peter Grünwald

Confidence and discoveries with e-values

We discuss systematically two versions of confidence regions: those based on p-values and those based on e-values, a recent alternative to p-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are…

Statistics Theory · Mathematics 2024-03-05 Vladimir Vovk, Ruodu Wang

The 'Right' Extension of Type-I Error to Data-Dependent Levels

The literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily…

Statistics Theory · Mathematics 2026-05-28 Nick W. Koning

Continuous Testing: Unifying Tests and E-values

The e-value is swiftly rising in prominence in many applications of hypothesis testing and multiple testing, yet its relationship to classical testing theory remains elusive. We unify e-values and classical testing into a single 'continuous…

Statistics Theory · Mathematics 2025-05-12 Nick W. Koning

E-values as statistical evidence: A comparison to Bayes factors, likelihoods, and p-values

A recurring debate in the philosophy of statistics concerns what, exactly, should count as a measure of evidence for or against a given hypothesis. P-values, likelihood ratios, and Bayes factors all have their defenders. In this paper we…

Methodology · Statistics 2026-03-26 Ben Chugg, Aaditya Ramdas, Peter Grünwald

Post-Hoc Large-Sample Statistical Inference

We derive inferential procedures for large sample sizes that remain valid under data-dependent significance levels (so-called "post-hoc valid inference"). Classical statistical tools require that the significance level -- the "type-I error"…

Statistics Theory · Mathematics 2026-03-10 Ben Chugg, Etienne Gauthier, Michael I. Jordan, Aaditya Ramdas, Ian Waudby-Smith

Bayes Factor Hypothesis Testing in Meta-Analyses: Practical Advantages and Methodological Considerations

Bayesian hypothesis testing via Bayes factors offers a principled alternative to classical p-value methods in meta-analysis, particularly suited to its cumulative and sequential nature. Unlike commonly reported p-values for standard null…

Methodology · Statistics 2026-04-22 Joris Mulder, Robbie C. M. van Aert

Safe Testing

We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study…

Statistics Theory · Mathematics 2023-03-13 Peter Grünwald, Rianne de Heide, Wouter Koolen

Choosing alpha post hoc: the danger of multiple standard significance thresholds

A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…

Applications · Statistics 2025-03-11 Jesse Hemerik, Nick W Koning

E-Values Expand the Scope of Conformal Prediction

Conformal prediction is a powerful framework for distribution-free uncertainty quantification. The standard approach to conformal prediction relies on comparing the ranks of prediction scores: under exchangeability, the rank of a future…

Machine Learning · Statistics 2025-05-07 Etienne Gauthier, Francis Bach, Michael I. Jordan

E-values: Calibration, combination, and applications

Multiple testing of a single hypothesis and testing multiple hypotheses are usually done in terms of p-values. In this paper we replace p-values with their natural competitor, e-values, which are closely related to betting, Bayes factors,…

Statistics Theory · Mathematics 2021-10-26 Vladimir Vovk, Ruodu Wang

Prediction-Powered E-Values

Quality statistical inference requires a sufficient amount of data, which can be missing or hard to obtain. To this end, prediction-powered inference has risen as a promising methodology, but existing approaches are largely limited to…

Machine Learning · Statistics 2025-05-27 Daniel Csillag, Claudio José Struchiner, Guilherme Tegoni Goedert

E-values as unnormalized weights in multiple testing

We study how to combine p-values and e-values, and design multiple testing procedures where both p-values and e-values are available for every hypothesis. Our results provide a new perspective on multiple testing with data-driven weights:…

Methodology · Statistics 2023-07-19 Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas

The e-value: A Fully Bayesian Significance Measure for Precise Statistical Hypotheses and its Research Program

This article gives a survey of the e-value, a statistical significance measure a.k.a. the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The…

Methodology · Statistics 2020-04-29 Julio Michael Stern, Carlos Alberto de Braganca Pereira

p-Value as the Strength of Evidence Measured by Confidence Distribution

The notion of p-value is a fundamental concept in statistical inference and has been widely used for reporting outcomes of hypothesis tests. However, p-value is often misinterpreted, misused or miscommunicated in practice. Part of the issue…

Methodology · Statistics 2020-02-03 Sifan Liu, Regina Liu, Min-ge Xie

Valid sequential inference on probability forecast performance

Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a correct forecast achieves a minimal…

Methodology · Statistics 2022-07-04 Alexander Henzi, Johanna F. Ziegel

Improved thresholds for e-values

The rejection threshold used for e-values and e-processes is by default set to $1/\alpha$ for a guaranteed type-I error control at $\alpha$, based on Markov's and Ville's inequalities. This threshold can be wasteful in practical…

Statistics Theory · Mathematics 2025-10-06 Christopher Blier-Wong, Ruodu Wang

A unified Bayesian framework for interval hypothesis testing in clinical trials

The American Statistical Association (ASA) statement on statistical significance and P-values \cite{wasserstein2016asa} cautioned statisticians against making scientific decisions solely on the basis of traditional P-values. The statement…

Methodology · Statistics 2024-02-22 Abhisek Chakraborty, Megan H. Murray, Ilya Lipkovich, Yu Du

A note on p-values interpreted as plausibilities

P-values are a mainstay in statistics but are often misinterpreted. We propose a new interpretation of p-value as a meaningful plausibility, where this is to be interpreted formally within the inferential model framework. We show that, for…

Statistics Theory · Mathematics 2014-10-28 Ryan Martin, Chuanhai Liu