Related papers: Improved thresholds for e-values
A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation…
A/B testing is ubiquitous within the machine learning and data science operations of internet companies. Generically, the idea is to perform a statistical test of the hypothesis that a new feature is better than the existing platform---for…
Two approaches to hypothesis testing, e-value testing and Bayes risk minimisation, both invoke Markov's inequality to control error probabilities. They differ in which distribution certifies the unit-moment condition: the null for Type I…
In traditional hypothesis testing one must pre-specify the significance level $\alpha$ to bound the 'size' of the test: its probability of falsely rejecting the hypothesis. Indeed, a data-dependent selection of $\alpha$ would generally distort…
Threshold selection plays a key role for various aspects of statistical inference of rare events. Most classical approaches tackling this problem for heavy-tailed distributions crucially depend on tuning parameters or critical values to be…
In multiple testing scenarios, the sign of a parameter is typically inferred when its estimate exceeds some significance threshold in absolute value. This threshold is usually chosen to control the experimentwise type I error…
The literature on hypothesis testing with data-dependent and post-hoc significance levels relies on a particular extension of the Type-I error to data-dependent levels. Existing arguments for this extension are heuristic, and primarily…
We consider the following sample selection problem. We observe in an online fashion a sequence of samples, each endowed with a quality. Our goal is to either select or reject each sample, so as to maximize the aggregate quality of the…
Compared to p-values, e-values provably guarantee safe, valid inference. If the goal is to test multiple hypotheses simultaneously, one can construct e-values for each individual test and then use the recently developed e-BH procedure to…
After the seminal Benjamini-Hochberg (BH) procedure for controlling the false discovery rate (FDR) was proposed, dozens of papers have attempted to improve its power by adapting to the unknown proportion of nulls. We observe that most null…
E-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. In brief, e-values are realized by random variables with expectation at most one under the null; examples include…
E-processes enable hypothesis testing with ongoing data collection while maintaining Type I error control. However, when testing multiple hypotheses simultaneously, current e-value-based multiple testing methods such as e-BH are not…
We discuss systematically two versions of confidence regions: those based on p-values and those based on e-values, a recent alternative to p-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are…
We derive the unique e-values with optimal (relative) growth rate in the worst case for testing the mean of a bounded random variable, thereby contributing the first application beyond the assumption of mutually absolutely continuous…
The knockoff filter is a powerful tool for controlled variable selection with false discovery rate (FDR) control. In this paper, we leverage e-values to allow the nominal FDR level to be switched post-hoc, after looking at the data and…
The e-value is gaining traction as a robust alternative to p-values and Bayes factors for quantifying statistical evidence. e-values are a promising method for adaptive clinical trials due to their anytime-validity: e-values ensure type I…
A number of authentication protocols have been proposed recently, where at least some part of the authentication is performed during a phase, lasting $n$ rounds, with no error correction. This requires assigning an acceptable threshold for…
A fundamental assumption of classical hypothesis testing is that the significance threshold $\alpha$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $\alpha$ beforehand. We point out…
Distribution-free predictive inference beyond the construction of prediction sets has gained a lot of interest in recent applications. One such application is the selection task, where the objective is to design a reliable selection rule to…
I describe a procedure for calculating thresholds for quantum computation as a function of error model given the availability of ancillae prepared in logical states with independent, identically distributed errors. The thresholds are…