Related papers: Global Sequential Testing for Multi-Stream Auditin…
We describe group sequential tests which efficiently incorporate information from multiple endpoints allowing for early stopping at pre-planned interim analyses. We formulate a testing procedure where several outcomes are examined, and…
This paper addresses the following general scenario: A scientist wishes to perform a battery of experiments, each generating a sequential stream of data, to investigate some phenomenon. The scientist would like to control the overall error…
Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or related procedures when an initial sample does not provide a…
We consider a quantum system that is being continuously monitored, giving rise to a measurement signal. From such a stream of data, information needs to be inferred about the underlying system's dynamics. Here we focus on hypothesis testing…
Multi-stream sequential change detection involves simultaneously monitoring many streams of data and trying to detect when their distributions change, if at all. Here, we theoretically study multiple testing issues that arise from detecting…
This work considers the problem of detecting signals from multiple sequentially observed data streams, where only one stream can be observed at every time instant. The goal is to detect signals as quickly as possible while controlling the…
In this paper, we derive power guarantees of some sequential tests for bounded mean under general alternatives. We focus on testing procedures using nonnegative supermartingales which are anytime valid and consider alternatives which…
The problem of simultaneously testing the marginal distributions of sequentially monitored, independent data streams is considered. The decisions for the various testing problems can be made at different times, using data from all streams,…
Online evaluation of machine learning models is typically conducted through A/B experiments. Sequential statistical tests are valuable tools for analysing these experiments, as they enable researchers to stop data collection early without…
In a Monte-Carlo test, the observed dataset is fixed, and several resampled or permuted versions of the dataset are generated in order to test a null hypothesis that the original dataset is exchangeable with the resampled/permuted ones.…
The sequential multiple testing problem is considered under two generalized error metrics. Under the first one, the probability of at least $k$ mistakes, of any kind, is controlled. Under the second, the probabilities of at least $k_1$…
We develop conservative tests for the mean of a bounded population under stratified sampling and apply them to risk-limiting post-election audits. The tests are ``anytime valid'' under sequential sampling, allowing optional stopping in each…
The $\gamma$-FDP and $k$-FWER multiple testing error metrics, which are tail probabilities of the respective error statistics, have become popular recently as less-stringent alternatives to the FDR and FWER. We propose general and flexible…
The problem of sequential anomaly detection and identification is considered, where multiple data sources are simultaneously monitored and the goal is to identify in real time those, if any, that exhibit ``anomalous" statistical behavior.…
This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple…
We revisit test-time scaling for language model reasoning and ask a fundamental question: at equal token budget and compute, is it better to run multiple independent chains in parallel, or to run fewer chains that iteratively refine through…
In complex clinical trials, multiple research objectives are often grouped into sets of objectives based on their inherent hierarchical relationships. Consequently, the hypotheses formulated to address these objectives are grouped into…
We consider the problem of sequential hypothesis testing by betting. For a general class of composite testing problems -- which include bounded mean testing, equal mean testing for bounded random tuples, and some key ingredients of…
We want to select the best systems out of a given set of systems (or rank them) with respect to their expected performance. The systems allow random observations only and we assume that the joint observation of the systems has a…
We propose a general and flexible procedure for testing multiple hypotheses about sequential (or streaming) data that simultaneously controls both the false discovery rate (FDR) and false nondiscovery rate (FNR) under minimal assumptions…