Related papers: Multiple Testing in Generalized Universal Inferenc…
We study how to combine p-values and e-values, and design multiple testing procedures where both p-values and e-values are available for every hypothesis. Our results provide a new perspective on multiple testing with data-driven weights:…
Multiple testing of a single hypothesis and testing multiple hypotheses are usually done in terms of p-values. In this paper we replace p-values with their natural competitor, e-values, which are closely related to betting, Bayes factors,…
E-processes enable hypothesis testing with ongoing data collection while maintaining Type I error control. However, when testing multiple hypotheses simultaneously, current e-value-based multiple testing methods such as e-BH are not…
Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a correct forecast achieves a minimal…
The e-BH procedure is an e-value-based multiple testing procedure that provably controls the false discovery rate (FDR) under any dependence structure between the e-values. Despite this appealing theoretical FDR control guarantee, the e-BH…
A common goal in statistics and machine learning is estimation of unknowns. Point estimates alone are of little value without an accompanying measure of uncertainty, but traditional uncertainty quantification methods, such as confidence…
A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation…
A/B tests are typically analyzed via frequentist p-values and confidence intervals, but these inferences are wholly unreliable if users endogenously choose sample sizes by *continuously monitoring* their tests. We define *always valid*…
In this paper we use e-values in the context of multiple hypothesis testing assuming that the base tests produce independent, or sequential, e-values. Our simulation and empirical studies and theoretical considerations suggest that, under…
We discuss systematically two versions of confidence regions: those based on p-values and those based on e-values, a recent alternative to p-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are…
We explicitly define the notions of (bona fide, approximate or asymptotic) compound p-values and e-values, which have been implicitly presented and used in the recent multiple testing literature. While it is known that the e-BH procedure…
Motivated by recent findings in Li and Zhang (2025), which established an equivalence between certain p-value-based multiple testing procedures and the e-Benjamini-Hochberg procedure (Wang and Ramdas, 2022), we introduce a general framework…
Quality statistical inference requires a sufficient amount of data, which can be missing or hard to obtain. To this end, prediction-powered inference has risen as a promising methodology, but existing approaches are largely limited to…
Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously. However, the classical statistical tools…
Conformal prediction is a powerful framework for distribution-free uncertainty quantification. The standard approach to conformal prediction relies on comparing the ranks of prediction scores: under exchangeability, the rank of a future…
Hypothesis testing and interval estimation are studied for the common mean of several lognormal populations. Two methods are given, based on the concepts of the generalized p-value and the generalized confidence interval. These new methods are…
This article gives a survey of the e-value, a statistical significance measure also known as the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The…
Many multiple testing procedures make use of the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently…
The e-value is swiftly rising in prominence in many applications of hypothesis testing and multiple testing, yet its relationship to classical testing theory remains elusive. We unify e-values and classical testing into a single 'continuous…
It is quite common in modern research for a researcher to test many hypotheses. The statistical (frequentist) hypothesis testing framework does not scale with the number of hypotheses in the sense that naively performing many hypothesis…