Related papers: MMD Aggregated Two-Sample Test

A Witness Two-Sample Test

The Maximum Mean Discrepancy (MMD) has been the state-of-the-art nonparametric test for tackling the two-sample problem. Its statistic is given by the difference in expectations of the witness function, a real-valued function defined as a…

Machine Learning · Computer Science 2022-02-14 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

Boosting the Power of Kernel Two-Sample Tests

The kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power…

Methodology · Statistics 2024-09-06 Anirban Chatterjee, Bhaswar B. Bhattacharya

Non-asymptotic two-sample kernel testing with the spectrally truncated normalized MMD

Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…

Statistics Theory · Mathematics 2026-04-09 Perrine Lacroix, Bertrand Michel, Franck Picard, Vincent Rivoirard

Partial identification of kernel based two sample tests with mismeasured data

Nonparametric two-sample tests such as the Maximum Mean Discrepancy (MMD) are often used to detect differences between two distributions in machine learning applications. However, the majority of existing literature assumes that error-free…

Machine Learning · Statistics 2023-08-08 Ron Nafshi, Maggie Makar

MMD-FUSE: Learning and Combining Kernels for Two-Sample Testing Without Data Splitting

We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD), by adapting over the set of kernels used in defining it. For finite sets, this reduces to combining (normalised) MMD…

Machine Learning · Statistics 2023-10-31 Felix Biggs, Antonin Schrab, Arthur Gretton

A Kernel Two-Sample Test for Functional Data

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne, Andrew B. Duncan

A Permutation-free Kernel Two-Sample Test

The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus…

Methodology · Statistics 2025-09-16 Shubhanshu Shekhar, Ilmun Kim, Aaditya Ramdas

A Scalable Nystrom-Based Kernel Two-Sample Test with Permutations

Two-sample hypothesis testing, determining whether two sets of data are drawn from the same distribution, is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…

Machine Learning · Statistics 2026-04-21 Antoine Chatalic, Marco Letizia, Nicolas Schreuder, Lorenzo Rosasco

A Martingale Kernel Two-Sample Test

The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution typically requiring costly resampling or permutation approaches…

Methodology · Statistics 2026-02-24 Anirban Chatterjee, Aaditya Ramdas

Variable Selection for Kernel Two-Sample Tests

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang, Santanu S. Dey, Yao Xie

Spectral Regularized Kernel Two-Sample Tests

Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of…

Statistics Theory · Mathematics 2024-05-03 Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

Maximum Mean Discrepancy with Unequal Sample Sizes via Generalized U-Statistics

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require…

Machine Learning · Statistics 2025-12-17 Aaron Wei, Milad Jalali, Danica J. Sutherland

Two-sample Statistics Based on Anisotropic Kernels

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng, Alexander Cloninger, Ronald R. Coifman

MMD Two-sample Testing in the Presence of Arbitrarily Missing Data

In many real-world applications, it is common that a proportion of the data may be missing or only partially observed. We develop a novel two-sample testing method based on the Maximum Mean Discrepancy (MMD) which accounts for missing data…

Methodology · Statistics 2024-05-27 Yijin Zeng, Niall M. Adams, Dean A. Bodenham

Computational-Statistical Trade-off in Kernel Two-Sample Testing with Random Fourier Features

Recent years have seen a surge in methods for two-sample testing, among which the Maximum Mean Discrepancy (MMD) test has emerged as an effective tool for handling complex and high-dimensional data. Despite its success and widespread…

Machine Learning · Statistics 2026-05-21 Ikjun Choi, Ilmun Kim

FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test

The maximum mean discrepancy (MMD) is a recently proposed test statistic for the two-sample test. Its quadratic time complexity, however, greatly hampers its applicability to large-scale applications. To accelerate the MMD calculation, in this…

Artificial Intelligence · Computer Science 2015-06-19 Ji Zhao, Deyu Meng

Kernel Two-Sample Tests for Manifold Data

We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize…

Machine Learning · Statistics 2024-02-27 Xiuyuan Cheng, Yao Xie

A Kernel Two-Sample Test Invariant under Group Action with Applications to Functional Data

We introduce a kernel-based two-sample test for comparing probability distributions up to group actions. Our construction yields invariant kernels for locally compact $\sigma$-compact groups and extends classical Haar-based approaches…

Statistics Theory · Mathematics 2026-03-18 Madison Giacofci, Anouar Meynaoui, Alex Podgorny

Signature Maximum Mean Discrepancy Two-Sample Statistical Tests

Maximum Mean Discrepancy (MMD) is a widely used concept in machine learning research which has gained popularity in recent years as a highly effective tool for comparing (finite-dimensional) distributions. Since it is designed as a…

Machine Learning · Statistics 2025-06-03 Andrew Alden, Blanka Horvath, Zacharia Issa

Two Sample Testing in High Dimension via Maximum Mean Discrepancy

Maximum Mean Discrepancy (MMD) has been widely used in the areas of machine learning and statistics to quantify the distance between two distributions in the $p$-dimensional Euclidean space. The asymptotic property of the sample MMD has…

Statistics Theory · Mathematics 2023-08-29 Hanjia Gao, Xiaofeng Shao