Related papers: Variable Selection for Kernel Two-Sample Tests

200 papers

We study two-sample variable selection: identifying variables that discriminate between the distributions of two sets of data vectors. Such variables help scientists understand the mechanisms behind dataset discrepancies. Although…

Machine Learning · Statistics 2025-11-06 Kensuke Mitsuzawa, Motonobu Kanagawa, Stefano Bortoli, Margherita Grossi, Paolo Papotti

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne, Andrew B. Duncan

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in reproducing kernel Hilbert space…

Methodology · Statistics 2025-08-21 Rui Cui, Yuhao Li, Xiaojun Song

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics 2025-10-03 Hoseung Song, Hao Chen

The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus…

Methodology · Statistics 2025-09-16 Shubhanshu Shekhar, Ilmun Kim, Aaditya Ramdas

We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD), by adapting over the set of kernels used in defining it. For finite sets, this reduces to combining (normalised) MMD…

Machine Learning · Statistics 2023-10-31 Felix Biggs, Antonin Schrab, Arthur Gretton

Two-sample hypothesis testing, that is, determining whether two sets of data are drawn from the same distribution, is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…

Machine Learning · Statistics 2026-04-21 Antoine Chatalic, Marco Letizia, Nicolas Schreuder, Lorenzo Rosasco

The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution typically requiring costly resampling or permutation approaches…

Methodology · Statistics 2026-02-24 Anirban Chatterjee, Aaditya Ramdas

The Maximum Mean Discrepancy (MMD) is a cornerstone statistic for nonparametric two-sample testing, but its test power is dictated entirely by the chosen kernel. Because any fixed kernel inherently fails to distinguish certain…

Machine Learning · Statistics 2026-05-11 Yijin Ni, Xiaoming Huo

Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…

Statistics Theory · Mathematics 2026-04-09 Perrine Lacroix, Bertrand Michel, Franck Picard, Vincent Rivoirard

We introduce a kernel-based two-sample test for comparing probability distributions up to group actions. Our construction yields invariant kernels for locally compact $\sigma$-compact groups and extends classical Haar-based approaches…

Statistics Theory · Mathematics 2026-03-18 Madison Giacofci, Anouar Meynaoui, Alex Podgorny

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng, Alexander Cloninger, Ronald R. Coifman

Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics. While…

Machine Learning · Computer Science 2020-10-20 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

We propose two novel nonparametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD). First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to…

Machine Learning · Statistics 2023-08-22 Antonin Schrab, Ilmun Kim, Mélisande Albert, Béatrice Laurent, Benjamin Guedj, Arthur Gretton

The Maximum Mean Discrepancy (MMD) has been the state-of-the-art nonparametric test for tackling the two-sample problem. Its statistic is given by the difference in expectations of the witness function, a real-valued function defined as a…

Machine Learning · Computer Science 2022-02-14 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize…

Machine Learning · Statistics 2024-02-27 Xiuyuan Cheng, Yao Xie

In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where…

Methodology · Statistics 2022-01-25 Qinyi Zhang, Veit Wild, Sarah Filippi, Seth Flaxman, Dino Sejdinovic

The kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power…

Methodology · Statistics 2024-09-06 Anirban Chatterjee, Bhaswar B. Bhattacharya

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with…

Machine Learning · Statistics 2025-12-30 Antonin Schrab

Nonparametric two-sample tests such as the Maximum Mean Discrepancy (MMD) are often used to detect differences between two distributions in machine learning applications. However, the majority of existing literature assumes that error-free…

Machine Learning · Statistics 2023-08-08 Ron Nafshi, Maggie Makar