English
Related papers

Related papers: A Scalable Nystrom-Based Kernel Two-Sample Test wi…

200 papers

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng , Alexander Cloninger , Ronald R. Coifman

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne , Andrew B. Duncan

The kernel Maximum Mean Discrepancy~(MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus…

Methodology · Statistics 2025-09-16 Shubhanshu Shekhar , Ilmun Kim , Aaditya Ramdas

Maximum Mean Discrepancy (MMD) has been widely used in the areas of machine learning and statistics to quantify the distance between two distributions in the $p$-dimensional Euclidean space. The asymptotic property of the sample MMD has…

Statistics Theory · Mathematics 2023-08-29 Hanjia Gao , Xiaofeng Shao

The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution typically requiring costly resampling or permutation approaches…

Methodology · Statistics 2026-02-24 Anirban Chatterjee , Aaditya Ramdas

Nonparametric two-sample tests such as the Maximum Mean Discrepancy (MMD) are often used to detect differences between two distributions in machine learning applications. However, the majority of existing literature assumes that error-free…

Machine Learning · Statistics 2023-08-08 Ron Nafshi , Maggie Makar

Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…

Statistics Theory · Mathematics 2026-04-09 Perrine Lacroix , Bertrand Michel , Franck Picard , Vincent Rivoirard

Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of…

Statistics Theory · Mathematics 2024-05-03 Omar Hagrass , Bharath K. Sriperumbudur , Bing Li

The Maximum Mean Discrepancy (MMD) has been the state-of-the-art nonparametric test for tackling the two-sample problem. Its statistic is given by the difference in expectations of the witness function, a real-valued function defined as a…

Machine Learning · Computer Science 2022-02-14 Jonas M. Kübler , Wittawat Jitkrittum , Bernhard Schölkopf , Krikamol Muandet

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require…

Machine Learning · Statistics 2025-12-17 Aaron Wei , Milad Jalali , Danica J. Sutherland

Maximum Mean Discrepancy (MMD) is a widely used concept in machine learning research which has gained popularity in recent years as a highly effective tool for comparing (finite-dimensional) distributions. Since it is designed as a…

Machine Learning · Statistics 2025-06-03 Andrew Alden , Blanka Horvath , Zacharia Issa

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang , Santanu S. Dey , Yao Xie

We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD), by adapting over the set of kernels used in defining it. For finite sets, this reduces to combining (normalised) MMD…

Machine Learning · Statistics 2023-10-31 Felix Biggs , Antonin Schrab , Arthur Gretton

Recent years have seen a surge in methods for two-sample testing, among which the Maximum Mean Discrepancy (MMD) test has emerged as an effective tool for handling complex and high-dimensional data. Despite its success and widespread…

Machine Learning · Statistics 2026-05-21 Ikjun Choi , Ilmun Kim

Do two data samples come from different distributions? Recent studies of this fundamental problem focused on embedding probability distributions into sufficiently rich characteristic Reproducing Kernel Hilbert Spaces (RKHSs), to compare…

Machine Learning · Computer Science 2013-05-03 Somayeh Danafar , Paola M. V. Rancoita , Tobias Glasmachers , Kevin Whittingstall , Juergen Schmidhuber

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics 2025-10-03 Hoseung Song , Hao Chen

In many real-world applications, it is common that a proportion of the data may be missing or only partially observed. We develop a novel two-sample testing method based on the Maximum Mean Discrepancy (MMD) which accounts for missing data…

Methodology · Statistics 2024-05-27 Yijin Zeng , Niall M. Adams , Dean A. Bodenham

We propose two novel nonparametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD). First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to…

Machine Learning · Statistics 2023-08-22 Antonin Schrab , Ilmun Kim , Mélisande Albert , Béatrice Laurent , Benjamin Guedj , Arthur Gretton

Nonparametric two sample testing deals with the question of consistently deciding if two distributions are different, given samples from both, without making any parametric assumptions about the form of the distributions. The current…

Statistics Theory · Mathematics 2014-11-25 Aaditya Ramdas , Sashank J. Reddi , Barnabas Poczos , Aarti Singh , Larry Wasserman

The maximum mean discrepancy (MMD) is a kernel-based distance between probability distributions useful in many applications (Gretton et al. 2012), bearing a simple estimator with pleasing computational and statistical properties. Being able…

Machine Learning · Statistics 2022-11-16 Danica J. Sutherland , Namrata Deka
‹ Prev 1 2 3 10 Next ›