Related papers: AutoML Two-Sample Test

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li, Visar Berisha, Gautam Dasarathy

A Scalable Nyström-Based Kernel Two-Sample Test with Permutations

Two-sample hypothesis testing, that is, determining whether two sets of data are drawn from the same distribution, is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…

Machine Learning · Statistics 2026-04-21 Antoine Chatalic, Marco Letizia, Nicolas Schreuder, Lorenzo Rosasco

Machine Learning for Two-Sample Testing under Right-Censored Data: A Simulation Study

The focus of this study is to evaluate the effectiveness of Machine Learning (ML) methods for two-sample testing with right-censored observations. To achieve this, we develop several ML-based methods with varying architectures and implement…

Machine Learning · Computer Science 2024-09-27 Petr Philonenko, Sergey Postovalov

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming…

Machine Learning · Statistics 2022-01-06 Feng Liu, Wenkai Xu, Jie Lu, Danica J. Sutherland

MATES: Multi-view Aggregated Two-Sample Test

The two-sample test is a fundamental problem in statistics with a wide range of applications. In the realm of high-dimensional data, nonparametric methods have gained prominence due to their flexibility and minimal distributional…

Methodology · Statistics 2024-12-24 Zexi Cai, Wenbo Fei, Doudou Zhou

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

Active Sequential Two-Sample Testing

A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…

Machine Learning · Computer Science 2024-07-01 Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

Two-Sample Test for Stochastic Block Models via Maximum Entry-wise Deviation

The stochastic block model is a popular tool for detecting community structures in network data. Detecting the difference between two community structures is an important issue for stochastic block models. However, the two-sample test has…

Methodology · Statistics 2022-12-21 Kang Fu, Jianwei Hu, Seydou Keita, Hao Liu

Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions

A number of applications require two-sample testing on ranked preference data. For instance, in crowdsourcing, there is a long-standing question of whether pairwise comparison data provided by people is distributed similar to…

Machine Learning · Statistics 2020-11-20 Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh

Statistical divergences in high-dimensional hypothesis testing and a modern technique for estimating them

Hypothesis testing in high dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between…

Data Analysis, Statistics and Probability · Physics 2024-08-02 Jeremy J. H. Wilkinson, Christopher G. Lester

The AUGUST Two-Sample Test: Powerful, Interpretable, and Fast

Two-sample testing is a fundamental problem in statistics, and many famous two-sample tests are designed to be fully non-parametric. These existing methods perform well with location and scale shifts but are less robust when faced with more…

Methodology · Statistics 2021-10-12 Benjamin Brown, Kai Zhang

A novel two-sample test within the space of symmetric positive definite matrix distributions and its application in finance

This paper introduces a novel two-sample test for a broad class of orthogonally equivalent positive definite symmetric matrix distributions. Our test is the first of its kind and we derive its asymptotic distribution. To estimate the test…

Methodology · Statistics 2023-08-15 Žikica Lukić, Bojana Milošević

MMD Two-sample Testing in the Presence of Arbitrarily Missing Data

In many real-world applications, it is common that a proportion of the data may be missing or only partially observed. We develop a novel two-sample testing method based on the Maximum Mean Discrepancy (MMD) which accounts for missing data…

Methodology · Statistics 2024-05-27 Yijin Zeng, Niall M. Adams, Dean A. Bodenham

Multivariate two-sample test statistics based on data depth

Data depth has been applied as a nonparametric measurement for ranking multivariate samples. In this paper, we focus on homogeneity tests to assess whether two multivariate samples are from the same distribution. There are many data…

Statistics Theory · Mathematics 2023-06-09 Yiting Chen, Wei Lin, Xiaoping Shi

A Kernel Two-sample Test for Dynamical Systems

Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems. This is particularly relevant for data generated by dynamical systems since such systems are essential for many…

Machine Learning · Statistics 2022-09-07 Friedrich Solowjow, Dominik Baumann, Christian Fiedler, Andreas Jocham, Thomas Seel, Sebastian Trimpe

Robust Tests for the Equality of Two Normal Means based on the Density Power Divergence

Statistical techniques are used in all branches of science to determine the feasibility of quantitative hypotheses. One of the most basic applications of statistical techniques in comparative analysis is the test of equality of two…

Methodology · Statistics 2018-05-01 Ayanendranath Basu, Abhijit Mandal, Nirian Martin, Leandro Pardo

Fast Two-Sample Testing with Analytic Representations of Probability Measures

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses…

Machine Learning · Statistics 2015-06-16 Kacper Chwialkowski, Aaditya Ramdas, Dino Sejdinovic, Arthur Gretton