Related papers: Bayes, E-values and Testing
We design sequential tests for a large class of nonparametric null hypotheses based on elicitable and identifiable functionals. Such functionals are defined in terms of scoring functions and identification functions, which are ideal…
A recurring debate in the philosophy of statistics concerns what, exactly, should count as a measure of evidence for or against a given hypothesis. P-values, likelihood ratios, and Bayes factors all have their defenders. In this paper we…
Bayesian Deep Ensembles (BDEs) represent a powerful approach for uncertainty quantification in deep learning, combining the robustness of Deep Ensembles (DEs) with flexible multi-chain MCMC. While DEs are affordable in most deep learning…
Adaptive clinical trials rely on interim analyses, flexible stopping, and data-dependent design modifications that complicate statistical guarantees when fixed-horizon test statistics are repeatedly inspected or reused after adaptations.…
Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a correct forecast achieves a minimal…
Confidence sequences, anytime p-values (called p-processes in this paper), and e-processes all enable sequential inference for composite and nonparametric classes of distributions at arbitrary stopping times. Examining the literature, one…
Given a composite null hypothesis H, test supermartingales are non-negative supermartingales with respect to H with initial value 1. Large values of test supermartingales provide evidence against H. As a result, test supermartingales are an…
We develop e-values and e-processes testing the null hypothesis that a distribution over nonnegative integers is monotone, and that a distribution over integers is unimodal given a certain mode. Our e-processes lead to tests of power one…
Given a random sample from a random variable $T$ which is bounded from above, $T\le\tau$ a.s., we define processes that are positive supermartingales if $E(T)\ge\mu$. Such processes are called test martingales. Tests of the supermartingale…
Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously. However, the classical statistical tools…
On observing a sequence of i.i.d. data with distribution $P$ on $\mathbb{R}^d$, we ask the question of how one can test the null hypothesis that $P$ has a log-concave density. This paper proves one interesting negative and positive result:…
We propose a sequential, anytime-valid method to test the conditional independence of a response $Y$ and a predictor $X$ given a random vector $Z$. The proposed test is based on e-statistics and test martingales, which generalize likelihood…
Confidence sequences based on test martingales provide time-uniform uncertainty quantification for the mean of bounded IID observations without parametric distributional assumptions. Their practical efficiency, however, depends strongly on…
In this paper, we derive power guarantees of some sequential tests for bounded mean under general alternatives. We focus on testing procedures using nonnegative supermartingales which are anytime valid and consider alternatives which…
Conformal prediction is a powerful framework for distribution-free uncertainty quantification. The standard approach to conformal prediction relies on comparing the ranks of prediction scores: under exchangeability, the rank of a future…
Given a positive random variable $X$, $X\ge0$ a.s., a null hypothesis $H_0:E(X)\le\mu$ and a random sample of infinite size of $X$, we construct test supermartingales for $H_0$, i.e. positive processes that are supermartingales if the null…
We propose a new algorithmic framework for sequential hypothesis testing with i.i.d. data, which includes A/B testing, nonparametric two-sample testing, and independence testing as special cases. It is novel in several ways: (a) it takes…
The e-value is gaining traction as a robust alternative to p-values and Bayes factors for quantifying statistical evidence. e-values are a promising method for adaptive clinical trials due to their anytime-validity: e-values ensure type I…
A nonnegative martingale with initial value equal to one measures evidence against a probabilistic hypothesis. The inverse of its value at some stopping time can be interpreted as a Bayes factor. If we exaggerate the evidence by considering…
In a Monte-Carlo test, the observed dataset is fixed, and several resampled or permuted versions of the dataset are generated in order to test a null hypothesis that the original dataset is exchangeable with the resampled/permuted ones.…