Global Sequential Testing for Multi-Stream Auditing

Machine Learning, 2026-05-26, v2

Abstract

Across many risk-sensitive areas, it is critical to continuously audit machine learning systems as more data arrive, so that we can quickly determine whether they are performing as designed. This auditing task can be modeled as a sequential hypothesis testing problem with $k$ data streams and a global null hypothesis asserting that the system operates as intended across all $k$ streams. Under the alternative, the standard global sequential test, which uses a Bonferroni correction, has an expected stopping time of $O\left(\ln \frac{k}{\alpha}\right)$ for large $k$ and significance level $\alpha$. In this work, we demonstrate that efficient sequential tests, which merge martingales via averaging and product rules, provide improved stopping times and thus more powerful tests against the null. Using these results, we show that a balanced test can match the Bonferroni rate of $O\left(\ln \frac{k}{\alpha}\right)$ in the sparse regime (just a few non-null streams) while achieving $O\left(\frac{1}{k}\ln \frac{1}{\alpha}\right)$ under dense alternatives (many non-null streams). We validate our theory through experiments on both synthetic and real-world data.
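To make the merging rules concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each stream is standard Gaussian under the null, that streams are independent (which the product rule needs), and that each stream's test martingale is a Gaussian likelihood ratio with a fixed, hypothetical bet size `lam`. The "balanced" statistic is assumed here to be the plain average of the averaged and product martingales; the abstract does not specify the exact construction. Every merged process starts at 1 and is a nonnegative martingale under the global null, so by Ville's inequality rejecting once it reaches $1/\alpha$ keeps the type-I error at most $\alpha$. The function name `stopping_times` and its parameters are illustrative.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stopping_times(streams, alpha=0.05, lam=0.5):
    """First time each merged test rejects the global null (None if never).

    streams: list of k equal-length lists of observations, assumed to be
    N(0, 1) under the null and independent across streams (assumption).
    """
    k, horizon = len(streams), len(streams[0])
    log_m = [0.0] * k  # per-stream log-martingales, each starting at log(1)
    threshold = math.log(1.0 / alpha)
    stops = {"bonferroni": None, "average": None, "product": None, "balanced": None}
    for t in range(horizon):
        for i in range(k):
            # Log likelihood-ratio increment for N(lam, 1) against N(0, 1).
            log_m[i] += lam * streams[i][t] - 0.5 * lam * lam
        log_avg = log_mean_exp(log_m)  # averaging rule
        log_prod = sum(log_m)          # product rule (needs independence)
        stats = {
            # Bonferroni: some stream's martingale must reach k / alpha.
            "bonferroni": max(log_m) - math.log(k),
            "average": log_avg,
            "product": log_prod,
            # Assumed balanced rule: mean of the two merged martingales.
            "balanced": log_mean_exp([log_avg, log_prod]),
        }
        for name, value in stats.items():
            if stops[name] is None and value >= threshold:
                stops[name] = t + 1
        if all(v is not None for v in stops.values()):
            break
    return stops


if __name__ == "__main__":
    rng = random.Random(0)
    k, horizon, shift = 20, 2000, 0.5
    # Dense alternative: every stream has mean `shift`.
    dense = [[rng.gauss(shift, 1.0) for _ in range(horizon)] for _ in range(k)]
    # Sparse alternative: only the first stream is non-null.
    sparse = [[rng.gauss(shift if i == 0 else 0.0, 1.0) for _ in range(horizon)]
              for i in range(k)]
    print("dense :", stopping_times(dense))
    print("sparse:", stopping_times(sparse))
```

In this sketch the averaged martingale is always at least the largest per-stream martingale divided by $k$, so the averaging rule never stops later than Bonferroni on the same data. The product rule benefits when many streams drift upward together, but null streams drag it down in the sparse regime, which is the trade-off the balanced statistic is meant to hedge.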

Cite

@article{arxiv.2602.21479,
  title  = {Global Sequential Testing for Multi-Stream Auditing},
  author = {Beepul Bharti and Ambar Pal and Jeremias Sulam},
  journal= {arXiv preprint arXiv:2602.21479},
  year   = {2026}
}