Related papers: Active Sequential Two-Sample Testing

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li, Visar Berisha, Gautam Dasarathy

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

General Frameworks for Conditional Two-Sample Testing

We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors. This problem commonly arises in various applications, such as…

Machine Learning · Statistics 2026-05-05 Seongchan Lee, Suman Cha, Ilmun Kim

A frequentist two-sample test based on Bayesian model selection

Despite their importance in supporting experimental conclusions, standard statistical tests are often inadequate for research areas, like the life sciences, where the typical sample size is small and the test assumptions difficult to…

Methodology · Statistics 2011-04-15 Pietro Berkes, Jozsef Fiser

Instance-Based Classification through Hypothesis Testing

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

Active Testing: Sample-Efficient Model Evaluation

We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of…

Machine Learning · Statistics 2021-06-15 Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

A New Step-down Procedure for Simultaneous Hypothesis Testing Under Dependence

In this article, we consider the problem of simultaneous testing of hypotheses when the individual test statistics are not necessarily independent. Specifically, we consider the problem of simultaneous testing of point null hypotheses…

Statistics Theory · Mathematics 2018-07-17 Prasenjit Ghosh, Arijit Chakrabarti

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

Local Two-Sample Testing over Graphs and Point-Clouds by Random-Walk Distributions

Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how…

Statistics Theory · Mathematics 2021-09-08 Boris Landa, Rihao Qu, Joseph Chang, Yuval Kluger

Revisiting Classifier Two-Sample Tests

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…

Machine Learning · Statistics 2018-03-14 David Lopez-Paz, Maxime Oquab

Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

Kernel Hypothesis Testing with Set-valued Data

We present a general framework for hypothesis testing on distributions of sets of individual examples. Sets may represent many common data sources such as groups of observations in time series, collections of words in text or a batch of…

Methodology · Statistics 2021-02-03 Alexis Bellot, Mihaela van der Schaar

Sequential multiple hypothesis testing in presence of control variables

Suppose that at any stage of a statistical experiment a control variable $X$ that affects the distribution of the observed data $Y$ at this stage can be used. The distribution of $Y$ depends on some unknown parameter $\theta$, and we…

Statistics Theory · Mathematics 2009-12-23 Andrey Novikov

Reliability of Sequential Hypothesis Testing Can Be Achieved by an Almost-Fixed-Length Test

The maximum type-I and type-II error exponents associated with the newly introduced almost-fixed-length hypothesis testing are characterized. In this class of tests, the decision-maker declares the true hypothesis almost always after…

Information Theory · Computer Science 2016-05-18 Anusha Lalitha, Tara Javidi

Sequential Predictive Two-Sample and Independence Testing

We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data,…

Machine Learning · Statistics 2023-07-21 Aleksandr Podkopaev, Aaditya Ramdas

AutoML Two-Sample Test

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery and as a means to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard…

Machine Learning · Computer Science 2023-01-18 Jonas M. Kübler, Vincent Stimper, Simon Buchholz, Krikamol Muandet, Bernhard Schölkopf

Comparing Generative Models with the New Physics Learning Machine

The rise of generative models for scientific research calls for the development of new methods to evaluate their fidelity. A natural framework for addressing this problem is two-sample hypothesis testing, namely the task of determining…

Machine Learning · Statistics 2025-08-05 Samuele Grossi, Marco Letizia, Riccardo Torre

Two-sample testing in non-sparse high-dimensional linear models

In analyzing high-dimensional models, sparsity of the model parameter is a common but often undesirable assumption. In this paper, we study the following two-sample testing problem: given two samples generated by two high-dimensional linear…

Statistics Theory · Mathematics 2017-08-16 Yinchu Zhu, Jelena Bradic