Related papers: YEAST: Yet Another Sequential Test

Analysis of Large Scale Web Experiments Using Sequences of Estimators

Experimental testing is vital in the optimization of web applications, and as such A/B testing has been widely adopted as a methodology for determining optimal content for many web applications. While some testing platforms provide…

Methodology · Statistics 2017-10-04 Ian E. Fellows

Rapid and Scalable Bayesian AB Testing

AB testing aids business operators with their decision making, and is considered the gold standard method for learning from data to improve digital user experiences. However, there is usually a gap between the requirements of practitioners,…

Machine Learning · Computer Science 2023-07-28 Srivas Chennu, Andrew Maher, Christian Pangerl, Subash Prabanantham, Jae Hyeon Bae, Jamie Martin, Bud Goswami

An Online Sequential Test for Qualitative Treatment Effects

Tech companies (e.g., Google or Facebook) often use randomized online experiments and/or A/B testing primarily based on the average treatment effects to compare their new product with an old one. However, it is also critically important to…

Methodology · Statistics 2021-11-09 Chengchun Shi, Shikai Luo, Hongtu Zhu, Rui Song

A New Framework for Online Testing of Heterogeneous Treatment Effect

We propose a new framework for online testing of heterogeneous treatment effects. The proposed test, named sequential score test (SST), is able to control type I error under continuous monitoring and detect multi-dimensional heterogeneous…

Methodology · Statistics 2020-02-11 Miao Yu, Wenbin Lu, Rui Song

Robust Sequential Experimental Design for A/B Testing

Experimental design has emerged as a powerful approach for improving the sample efficiency of A/B testing, yet existing designs rely critically on correctly specified models. We study robust sequential experimental design under model…

Machine Learning · Statistics 2026-05-14 Qianglin Wen, Xiangkun Wu, Chengchun Shi, Ting Li, Niansheng Tang, Yingying Zhang, Hongtu Zhu

Continuous Monitoring of A/B Tests without Pain: Optional Stopping in Bayesian Testing

A/B testing is one of the most successful applications of statistical theory in the modern Internet age. One problem of Null Hypothesis Statistical Testing (NHST), the backbone of A/B testing methodology, is that experimenters are not allowed…

Applications · Statistics 2016-02-18 Alex Deng, Jiannan Lu, Shouyuan Chen

Experimenting, Fast and Slow: Bayesian Optimization of Long-term Outcomes with Online Experiments

Online experiments in internet systems, also known as A/B tests, are used for a wide range of system tuning problems, such as optimizing recommender system ranking policies and learning adaptive streaming controllers. Decision-makers…

Machine Learning · Computer Science 2025-07-01 Qing Feng, Samuel Daulton, Benjamin Letham, Maximilian Balandat, Eytan Bakshy

Sequential hypothesis testing for continuously-monitored quantum systems

We consider a quantum system that is being continuously monitored, giving rise to a measurement signal. From such a stream of data, information needs to be inferred about the underlying system's dynamics. Here we focus on hypothesis testing…

Quantum Physics · Physics 2024-03-27 Giulio Gasbarri, Matias Bilkis, Elisabet Roda-Salichs, John Calsamiglia

On Post-Selection Inference in A/B Tests

When interpreting A/B tests, we typically focus only on the statistically significant results and take them at face value. This practice, termed post-selection inference in the statistical literature, may negatively affect both point…

Applications · Statistics 2021-06-01 Alex Deng, Yicheng Li, Jiannan Lu, Vivek Ramamurthy

Anytime-Valid Confidence Sequences in an Enterprise A/B Testing Platform

A/B tests are the gold standard for evaluating digital experiences on the web. However, traditional "fixed-horizon" statistical methods are often incompatible with the needs of modern industry practitioners as they do not permit continuous…

Applications · Statistics 2023-02-21 Akash V. Maharaj, Ritwik Sinha, David Arbour, Ian Waudby-Smith, Simon Z. Liu, Moumita Sinha, Raghavendra Addanki, Aaditya Ramdas, Manas Garg, Viswanathan Swaminathan

Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

With the growing needs of online A/B testing to support the innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring…

Machine Learning · Computer Science 2023-04-04 Runzhe Wan, Yu Liu, James McQueen, Doug Hains, Rui Song

Safe Sequential Testing and Effect Estimation in Stratified Count Data

Sequential decision making significantly speeds up research and is more cost-effective compared to fixed-n methods. We present a method for sequential decision making for stratified count data that retains Type-I error guarantee or false…

Methodology · Statistics 2023-02-23 Rosanne J. Turner, Peter D. Grünwald

Bayesian Predictive Probabilities for Online Experimentation

The widespread adoption of online randomized controlled experiments (A/B Tests) for decision-making has created ongoing capacity constraints which necessitate interim analyses. As a consequence, platform users are increasingly motivated to…

Applications · Statistics 2025-11-11 Abbas Zaidi, Rina Friedberg, Samir Khan, Yao-Yang Leow, Maulik Soneji, Houssam Nassif, Richard Mudd

Sequential testing problem: A follow-up review

This review aims to provide a comprehensive update on the progress made on the Sequential Testing problem (STP) in the last 20 years after the review, [1] was published. Many studies have provided new theoretical results, extensions of the…

Data Structures and Algorithms · Computer Science 2025-11-21 Tonguç Ünlüyurt

Sequential Hypothesis Test with Online Usage-Constrained Sensor Selection

This work investigates the sequential hypothesis testing problem with online sensor selection and sensor usage constraints. That is, in a sensor network, the fusion center sequentially acquires samples by selecting one "most informative"…

Applications · Statistics 2016-01-26 Shang Li, Xiaoou Li, Xiaodong Wang, Jingchen Liu

Always Valid Inference: Bringing Sequential Analysis to A/B Testing

A/B tests are typically analyzed via frequentist p-values and confidence intervals, but these inferences are wholly unreliable if users endogenously choose sample sizes by *continuously monitoring* their tests. We define *always valid*…

Statistics Theory · Mathematics 2019-07-18 Ramesh Johari, Leo Pekelis, David J. Walsh

Anytime-Valid Linear Models and Regression Adjusted Causal Inference in Randomized Experiments

Linear models are foundational tools in statistics and ubiquitous across the applied sciences. However, conventional statistical inference -- such as $t$-tests and $F$-tests -- are only valid at fixed sample sizes, making them unsuitable…

Methodology · Statistics 2025-07-08 Michael Lindon, Dae Woong Ham, Martin Tingley, Iavor Bojinov

Online Learning for Non-Stationary A/B Tests

The rollout of new versions of a feature in modern applications is a manual multi-stage process, as the feature is released to ever larger groups of users, while its performance is carefully monitored. This kind of A/B testing is…

Machine Learning · Computer Science 2018-05-29 Andrés Muñoz Medina, Sergei Vassilvitskii, Dong Yin

Beyond A/B Testing: Sequential Randomization for Developing Interventions in Scaled Digital Learning Environments

Randomized experiments ensure robust causal inference that is critical to effective learning analytics research and practice. However, traditional randomized experiments, like A/B tests, are limiting in large scale digital learning…

Applications · Statistics 2019-02-04 Timothy NeCamp, Josh Gardner, Christopher Brooks

STEB: In Search of the Best Evaluation Approach for Synthetic Time Series

The growing need for synthetic time series, due to data augmentation or privacy regulations, has led to numerous generative models, frameworks, and evaluation measures alike. Objectively comparing these measures on a large scale remains an…

Machine Learning · Computer Science 2025-05-28 Michael Stenger, Robert Leppich, André Bauer, Samuel Kounev