Related papers: Kernel Two-Sample Testing via Directional Componen…
Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…
We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…
Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of…
The kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power…
We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…
The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…
We introduce a kernel-based two-sample test for comparing probability distributions up to group actions. Our construction yields invariant kernels for locally compact $\sigma$-compact groups and extends classical Haar-based approaches…
Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…
Reproducing Kernel Hilbert Space (RKHS) embedding of probability distributions has proved to be an effective approach, via MMD (maximum mean discrepancy), for nonparametric hypothesis testing problems involving distributions defined over…
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize…
Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics. While…
The Maximum Mean Discrepancy (MMD) has been the state-of-the-art nonparametric test for tackling the two-sample problem. Its statistic is given by the difference in expectations of the witness function, a real-valued function defined as a…
We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD), by adapting over the set of kernels used in defining it. For finite sets, this reduces to combining (normalised) MMD…
Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a…
Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do…
In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where…
We propose a new one-sample test for normality in a Reproducing Kernel Hilbert Space (RKHS). Namely, we test the null-hypothesis of belonging to a given family of Gaussian distributions. Hence our procedure may be applied either to test…
Do two data samples come from different distributions? Recent studies of this fundamental problem focused on embedding probability distributions into sufficiently rich characteristic Reproducing Kernel Hilbert Spaces (RKHSs), to compare…
Two-sample hypothesis testing-determining whether two sets of data are drawn from the same distribution-is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…
Two-sample tests have been extensively employed in various scientific fields and machine learning such as evaluation on the effectiveness of drugs and A/B testing on different marketing strategies to discriminate whether two sets of samples…