Related papers: A label-efficient two-sample test

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li, Visar Berisha, Gautam Dasarathy

Active Sequential Two-Sample Testing

A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…

Machine Learning · Computer Science 2024-07-01 Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

Revisiting Classifier Two-Sample Tests

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…

Machine Learning · Statistics 2018-03-14 David Lopez-Paz, Maxime Oquab

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

Two-Sample Inference in Highly Dispersed Negative Binomial Models

Two-sample inference for the difference of population means typically relies upon a Central Limit Theorem approximation. When data are drawn from a Negative Binomial distribution, previous work of Shilane et al. (2010) showed that a Normal…

Methodology · Statistics 2012-03-06 David Shilane, Derek Bean

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

AutoML Two-Sample Test

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery and as a means to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard…

Machine Learning · Computer Science 2023-01-18 Jonas M. Kübler, Vincent Stimper, Simon Buchholz, Krikamol Muandet, Bernhard Schölkopf

Two-cluster test

Cluster analysis is a fundamental research issue in statistics and machine learning. In many modern clustering methods, we need to determine whether two subsets of samples come from the same cluster. Since these subsets are usually…

Machine Learning · Computer Science 2025-07-15 Xinying Liu, Lianyu Hu, Mudi Jiang, Simeng Zhang, Jun Lou, Zengyou He

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

Deep models trained with noisy labels are prone to over-fitting and struggle in generalization. Most existing solutions are based on an ideal assumption that the label noise is class-conditional, i.e., instances of the same class share the…

Computer Vision and Pattern Recognition · Computer Science 2022-08-01 Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu

Asymptotically Optimal Tests for One- and Two-Sample Problems

In this work, we revisit the one- and two-sample testing problems: binary hypothesis testing in which one or both distributions are unknown. For the one-sample test, we provide a more streamlined proof of the asymptotic optimality of…

Information Theory · Computer Science 2026-04-21 Arick Grootveld, Biao Chen, Venkata Gandikota

General Frameworks for Conditional Two-Sample Testing

We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors. This problem commonly arises in various applications, such as…

Machine Learning · Statistics 2026-05-05 Seongchan Lee, Suman Cha, Ilmun Kim

Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

Local Two-Sample Testing over Graphs and Point-Clouds by Random-Walk Distributions

Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how…

Statistics Theory · Mathematics 2021-09-08 Boris Landa, Rihao Qu, Joseph Chang, Yuval Kluger

A Bipartite Ranking Approach to the Two-Sample Problem

The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…

Statistics Theory · Mathematics 2023-02-09 Stephan Clémençon, Myrto Limnios, Nicolas Vayatis

A frequentist two-sample test based on Bayesian model selection

Despite their importance in supporting experimental conclusions, standard statistical tests are often inadequate for research areas, like the life sciences, where the typical sample size is small and the test assumptions difficult to…

Methodology · Statistics 2011-04-15 Pietro Berkes, Jozsef Fiser

Graph-Based Tests for Two-Sample Comparisons of Categorical Data

We study the problem of two-sample comparison with categorical data when the contingency table is sparsely populated. In modern applications, the number of categories is often comparable to the sample size, causing existing methods to have…

Methodology · Statistics 2014-08-14 Hao Chen, Nancy R. Zhang

Fast Two-Sample Testing with Analytic Representations of Probability Measures

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses…

Machine Learning · Statistics 2015-06-16 Kacper Chwialkowski, Aaditya Ramdas, Dino Sejdinovic, Arthur Gretton

Practical methods for graph two-sample testing

Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies…

Machine Learning · Statistics 2018-12-03 Debarghya Ghoshdastidar, Ulrike von Luxburg

Two-sample Bayesian nonparametric goodness-of-fit test

In recent years, Bayesian nonparametric statistics has gathered extraordinary attention. Nonetheless, relatively little work has been devoted to Bayesian nonparametric hypothesis testing. In this paper, a novel Bayesian…

Statistics Theory · Mathematics 2015-05-08 Luai Al Labadi, Emad Masuadi, Mahmoud Zarepour