Related papers: Revisiting Classifier Two-Sample Tests

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

E-Valuating Classifier Two-Sample Tests

We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and…

Methodology · Statistics 2024-05-01 Teodora Pandeva, Tim Bakker, Christian A. Naesseth, Patrick Forré

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

Classification Logit Two-sample Testing by Neural Networks

The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that…

Machine Learning · Statistics 2022-06-01 Xiuyuan Cheng, Alexander Cloninger

Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming…

Machine Learning · Statistics 2022-01-06 Feng Liu, Wenkai Xu, Jie Lu, Danica J. Sutherland

Instance-Based Classification through Hypothesis Testing

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

Two-sample test based on Self-Organizing Maps

Machine-learning classifiers can be leveraged as a two-sample statistical test. Suppose each sample is assigned a different label and that a classifier can obtain a better-than-chance result discriminating them. In this case, we can infer…

Machine Learning · Computer Science 2022-12-20 Alejandro Álvarez-Ayllón, Manuel Palomo-Duarte, Juan-Manuel Dodero
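
The idea in the abstract above (label each sample, train a classifier, and check whether it beats chance on held-out data) can be sketched in a few lines. This is a minimal illustration only, not the method of any paper listed here: the function name `c2st_p_value`, the nearest-centroid classifier, the 50/50 split, and the Binomial(n_test, 1/2) approximation to the null distribution of the number of correct predictions are all assumptions chosen for brevity.

```python
import math
import random


def c2st_p_value(sample_p, sample_q, train_frac=0.5, seed=0):
    """Illustrative classifier two-sample test.

    sample_p, sample_q: lists of equal-length numeric feature vectors.
    Returns (held-out accuracy, approximate one-sided p-value).
    """
    rng = random.Random(seed)
    # Label the first sample 0 and the second sample 1, then shuffle the pool.
    pooled = [(x, 0) for x in sample_p] + [(x, 1) for x in sample_q]
    rng.shuffle(pooled)
    n_train = int(len(pooled) * train_frac)
    train, test = pooled[:n_train], pooled[n_train:]

    # "Training" (assumed classifier): one centroid per label on the train split.
    dim = len(pooled[0][0])
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        if not pts:
            raise ValueError("training split is missing one of the labels")
        centroids[label] = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]

    def predict(x):
        # Assign the label of the closest centroid (squared Euclidean distance).
        dists = {
            lab: sum((a - b) ** 2 for a, b in zip(x, c))
            for lab, c in centroids.items()
        }
        return min(dists, key=dists.get)

    n_test = len(test)
    correct = sum(predict(x) == y for x, y in test)

    # Under the null, held-out accuracy should be near chance. Approximate the
    # number of correct predictions by Binomial(n_test, 1/2) and take the
    # upper tail P[count >= correct] as the p-value.
    tail = sum(math.comb(n_test, k) for k in range(correct, n_test + 1))
    return correct / n_test, tail / 2 ** n_test


if __name__ == "__main__":
    gen = random.Random(1)
    p = [[gen.gauss(0, 1), gen.gauss(0, 1)] for _ in range(200)]
    q = [[gen.gauss(3, 1), gen.gauss(3, 1)] for _ in range(200)]
    accuracy, p_value = c2st_p_value(p, q)
    print(f"held-out accuracy = {accuracy:.3f}, p-value = {p_value:.3g}")
```

A small p-value means the classifier separates the two samples better than chance, which is evidence against the null hypothesis that both samples come from the same distribution. Any classifier could replace the nearest-centroid rule, provided it is fit only on the training split.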

Active Sequential Two-Sample Testing

A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…

Machine Learning · Computer Science 2024-07-01 Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

Learning Deep Kernels for Non-Parametric Two-Sample Tests

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics 2021-01-15 Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li, Visar Berisha, Gautam Dasarathy

Two-cluster test

Cluster analysis is a fundamental research issue in statistics and machine learning. In many modern clustering methods, we need to determine whether two subsets of samples come from the same cluster. Since these subsets are usually…

Machine Learning · Computer Science 2025-07-15 Xinying Liu, Lianyu Hu, Mudi Jiang, Simeng Zhang, Jun Lou, Zengyou He

General Frameworks for Conditional Two-Sample Testing

We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors. This problem commonly arises in various applications, such as…

Machine Learning · Statistics 2026-05-05 Seongchan Lee, Suman Cha, Ilmun Kim

Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions

A number of applications require two-sample testing on ranked preference data. For instance, in crowdsourcing, there is a long-standing question of whether pairwise comparison data provided by people is distributed similarly to…

Machine Learning · Statistics 2020-11-20 Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh

Two-Sample Testing in High-Dimensional Models

We propose novel methodology for testing equality of model parameters between two high-dimensional populations. The technique is very general and applicable to a wide range of models. The method is based on sample splitting: the data is…

Methodology · Statistics 2013-01-17 Nicolas Städler, Sach Mukherjee

Collaborative non-parametric two-sample testing

This paper addresses the multiple two-sample test problem in a graph-structured setting, which is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ in a fixed graph deals with a two-sample testing problem…

Machine Learning · Statistics 2024-02-09 Alejandro de la Concha, Nicolas Vayatis, Argyris Kalogeratos

Sequential Classification with Empirically Observed Statistics

Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown and only a set of…

Machine Learning · Statistics 2021-02-11 Mahdi Haghifam, Vincent Y. F. Tan, Ashish Khisti

A frequentist two-sample test based on Bayesian model selection

Despite their importance in supporting experimental conclusions, standard statistical tests are often inadequate for research areas, like the life sciences, where the typical sample size is small and the test assumptions difficult to…

Methodology · Statistics 2011-04-15 Pietro Berkes, Jozsef Fiser