Related papers: Two-Sample Test Based on Classification Probabilit…

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

Classification Logit Two-sample Testing by Neural Networks

The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that…

Machine Learning · Statistics 2022-06-01 Xiuyuan Cheng, Alexander Cloninger

E-Valuating Classifier Two-Sample Tests

We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and…

Methodology · Statistics 2024-05-01 Teodora Pandeva, Tim Bakker, Christian A. Naesseth, Patrick Forré

Instance-Based Classification through Hypothesis Testing

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

On the Use of Random Forest for Two-Sample Testing

Following the line of classification-based two-sample testing, tests based on the Random Forest classifier are proposed. The developed tests are easy to use, require almost no tuning, and are applicable for any distribution on…

Methodology · Statistics 2021-05-07 Simon Hediger, Loris Michel, Jeffrey Näf

A Bipartite Ranking Approach to the Two-Sample Problem

The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…

Statistics Theory · Mathematics 2023-02-09 Stephan Clémençon, Myrto Limnios, Nicolas Vayatis

Exact and efficient multivariate two-sample tests through generalized linear rank statistics

So-called linear rank statistics provide a means for distribution-free (even in finite samples), yet highly flexible, two-sample testing in the setting of univariate random variables. Their flexibility derives from a choice of weights that…

Methodology · Statistics 2023-10-03 Dan D. Erdmann-Pham

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

Revisiting Classifier Two-Sample Tests

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…

Machine Learning · Statistics 2018-03-14 David Lopez-Paz, Maxime Oquab

On an Exact and Nonparametric Test for the Separability of Two Classes by Means of a Simple Threshold

This paper introduces a statistical test inferring whether a variable allows separating two classes by means of a single critical value. Its test statistic is the prediction error of a nonparametric threshold classifier. While this approach…

Methodology · Statistics 2017-07-17 Fabian Schroeder

A Two-Sample Conditional Distribution Test Using Conformal Prediction and Weighted Rank Sum

We consider the problem of testing the equality of conditional distributions of a response variable given a vector of covariates between two populations. Such a hypothesis testing problem can be motivated from various machine learning and…

Methodology · Statistics 2023-02-24 Xiaoyu Hu, Jing Lei

A frequentist two-sample test based on Bayesian model selection

Despite their importance in supporting experimental conclusions, standard statistical tests are often inadequate for research areas, like the life sciences, where the typical sample size is small and the test assumptions difficult to…

Methodology · Statistics 2011-04-15 Pietro Berkes, Jozsef Fiser

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions

A number of applications require two-sample testing on ranked preference data. For instance, in crowdsourcing, there is a long-standing question of whether pairwise comparison data provided by people is distributed similar to…

Machine Learning · Statistics 2020-11-20 Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh

Statistical divergences in high-dimensional hypothesis testing and a modern technique for estimating them

Hypothesis testing in high dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between…

Data Analysis, Statistics and Probability · Physics 2024-08-02 Jeremy J. H. Wilkinson, Christopher G. Lester

An Approximation-based Approach for the Random Exploration of Large Models

System modeling is a classical approach to ensure their reliability since it is suitable both for a formal verification and for software testing techniques. In the context of model-based testing an approach combining random testing and…

Software Engineering · Computer Science 2018-06-14 Julien Bernard, Pierre-Cyrille Héam, Olga Kouchnarenko

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li, Visar Berisha, Gautam Dasarathy

Active Sequential Two-Sample Testing

A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…

Machine Learning · Computer Science 2024-07-01 Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

Two-Sample Testing in High-Dimensional Models

We propose novel methodology for testing equality of model parameters between two high-dimensional populations. The technique is very general and applicable to a wide range of models. The method is based on sample splitting: the data is…

Methodology · Statistics 2013-01-17 Nicolas Städler, Sach Mukherjee

A new class of robust two-sample Wald-type tests

Parametric hypothesis testing associated with two independent samples arises frequently in several applications in biology, medical sciences, epidemiology, reliability and many more. In this paper, we propose robust Wald-type tests for…

Methodology · Statistics 2019-05-09 Abhik Ghosh, Nirian Martin, Ayanendranath Basu, Leandro Pardo