English
Related papers

Related papers: Learning Kernel Tests Without Data Splitting

200 papers

We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD), by adapting over the set of kernels used in defining it. For finite sets, this reduces to combining (normalised) MMD…

Machine Learning · Statistics 2023-10-31 Felix Biggs , Antonin Schrab , Arthur Gretton

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics 2025-10-03 Hoseung Song , Hao Chen

Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each…

Machine Learning · Statistics 2025-03-26 Alessandro Barp , Carl-Johann Simon-Gabriel , Mark Girolami , Lester Mackey

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in reproducing kernel Hilbert space…

Methodology · Statistics 2025-08-21 Rui Cui , Yuhao Li , Xiaojun Song

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang , Santanu S. Dey , Yao Xie

Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…

Statistics Theory · Mathematics 2026-04-09 Perrine Lacroix , Bertrand Michel , Franck Picard , Vincent Rivoirard

The kernel Maximum Mean Discrepancy~(MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus…

Methodology · Statistics 2025-09-16 Shubhanshu Shekhar , Ilmun Kim , Aaditya Ramdas

We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables, such that the marginal distribution of the observed variables is…

Machine Learning · Statistics 2023-05-10 Heishiro Kanagawa , Wittawat Jitkrittum , Lester Mackey , Kenji Fukumizu , Arthur Gretton

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne , Andrew B. Duncan

The kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power…

Methodology · Statistics 2024-09-06 Anirban Chatterjee , Bhaswar B. Bhattacharya

The Maximum Mean Discrepancy (MMD) is a cornerstone statistic for nonparametric two-sample testing, but its test power is dictated entirely by the chosen kernel. Because any fixed kernel inherently fails to distinguish certain…

Machine Learning · Statistics 2026-05-11 Yijin Ni , Xiaoming Huo

Approximate Markov chain Monte Carlo (MCMC) offers the promise of more rapid sampling at the cost of more biased inference. Since standard MCMC diagnostics fail to detect these biases, researchers have developed computable Stein discrepancy…

Machine Learning · Statistics 2020-10-16 Jackson Gorham , Lester Mackey

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with…

Machine Learning · Statistics 2025-12-30 Antonin Schrab

Two-sample hypothesis testing-determining whether two sets of data are drawn from the same distribution-is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…

Machine Learning · Statistics 2026-04-21 Antoine Chatalic , Marco Letizia , Nicolas Schreuder , Lorenzo Rosasco

Much of machine learning relies on comparing distributions with discrepancy measures. Stein's method creates discrepancy measures between two distributions that require only the unnormalized density of one and samples from the other. Stein…

Machine Learning · Statistics 2020-07-21 Raghav Singhal , Xintian Han , Saad Lahlou , Rajesh Ranganath

We introduce the Kernel Calibration Conditional Stein Discrepancy test (KCCSD test), a non-parametric, kernel-based test for assessing the calibration of probabilistic models with well-defined scores. In contrast to previous methods, our…

Machine Learning · Statistics 2025-10-17 Pierre Glaser , David Widmann , Fredrik Lindsten , Arthur Gretton

We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional differences, because failure to reject the null…

Machine Learning · Statistics 2026-03-17 Xing Liu , Axel Gandy

Kernelized Stein discrepancy (KSD) is a score-based discrepancy widely used in goodness-of-fit tests. It can be applied even when the target distribution has an unknown normalising factor, such as in Bayesian analysis. We show theoretically…

Machine Learning · Statistics 2023-06-06 Xing Liu , Andrew B. Duncan , Axel Gandy

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng , Alexander Cloninger , Ronald R. Coifman

We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize…

Machine Learning · Statistics 2024-02-27 Xiuyuan Cheng , Yao Xie
‹ Prev 1 2 3 10 Next ›