Related papers: A Kernel Method for the Two-Sample Problem

Two-sample inference for sparse functional data

We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our proposed method is adept at handling sparsely and irregularly sampled functional data when…

Methodology · Statistics 2025-01-29 Chi Zhang, Peijun Sang, Yingli Qin

Fast Two-Sample Testing with Analytic Representations of Probability Measures

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses…

Machine Learning · Statistics 2015-06-16 Kacper Chwialkowski, Aaditya Ramdas, Dino Sejdinovic, Arthur Gretton

A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference)

In this paper we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the "kernel trick" and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed…

Methodology · Statistics 2024-08-30 Anirban Chatterjee, Ziang Niu, Bhaswar B. Bhattacharya

Kernel based method for the $k$-sample problem

In this paper we deal with the problem of testing for the equality of $k$ probability distributions defined on $(\mathcal{X},\mathcal{B})$, where $\mathcal{X}$ is a metric space and $\mathcal{B}$ is the corresponding Borel $\sigma$-field.…

Statistics Theory · Mathematics 2018-12-04 Armando Sosthene Kali Balogoun, Guy Martial Nkiet, Carlos Ogouyandjou

Generalized Kernel Two-Sample Tests

Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do…

Methodology · Statistics 2023-11-21 Hoseung Song, Hao Chen

A Differentially Private Kernel Two-Sample Test

Kernel two-sample testing is a useful statistical tool in determining whether data samples arise from different distributions without imposing any parametric assumptions on those distributions. However, raw data samples can expose sensitive…

Machine Learning · Statistics 2018-08-02 Anant Raj, Ho Chung Leon Law, Dino Sejdinovic, Mijung Park

Bayesian Kernel Two-Sample Testing

In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where…

Methodology · Statistics 2022-01-25 Qinyi Zhang, Veit Wild, Sarah Filippi, Seth Flaxman, Dino Sejdinovic

Spectral Regularized Kernel Two-Sample Tests

Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of…

Statistics Theory · Mathematics 2024-05-03 Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing

Are two sets of observations drawn from the same distribution? This problem is a two-sample test. Kernel methods lead to many appealing properties. Indeed state-of-the-art approaches use the $L^2$ distance between kernel-based distribution…

Machine Learning · Statistics 2019-10-02 M. Scetbon, G. Varoquaux

Two-sample Statistics Based on Anisotropic Kernels

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng, Alexander Cloninger, Ronald R. Coifman

A Kernel Two-Sample Test for Functional Data

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne, Andrew B. Duncan

A uniform kernel trick for high-dimensional two-sample problems

We use a suitable version of the so-called "kernel trick" to devise two-sample (homogeneity) tests, especially focussed on high-dimensional and functional data. Our proposal entails a simplification related to the important practical…

Statistics Theory · Mathematics 2024-04-24 Javier Cárcamo, Antonio Cuevas, Luis-Alberto Rodríguez

Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

Kernel Two-Sample Testing via Directional Components Analysis

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in reproducing kernel Hilbert space…

Methodology · Statistics 2025-08-21 Rui Cui, Yuhao Li, Xiaojun Song

Hypothesis testing using pairwise distances and associated kernels (with Appendix)

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between…

Machine Learning · Computer Science 2015-03-20 Dino Sejdinovic, Arthur Gretton, Bharath Sriperumbudur, Kenji Fukumizu

A Uniform Concentration Inequality for Kernel-Based Two-Sample Statistics

In many contemporary statistical and machine learning methods, one needs to optimize an objective function that depends on the discrepancy between two probability distributions. The discrepancy can be referred to as a metric for…

Machine Learning · Computer Science 2025-02-11 Yijin Ni, Xiaoming Huo

Minimax Optimal Kernel Two-Sample Tests with Random Features

Reproducing Kernel Hilbert Space (RKHS) embedding of probability distributions has proved to be an effective approach, via MMD (maximum mean discrepancy), for nonparametric hypothesis testing problems involving distributions defined over…

Statistics Theory · Mathematics 2025-10-17 Soumya Mukherjee, Bharath K. Sriperumbudur

Kernel K-means clustering of distributional data

We consider the problem of clustering a sample of probability distributions from a random distribution on $\mathbb R^p$. Our proposed partitioning method makes use of a symmetric, positive-definite kernel $k$ and its associated reproducing…

Machine Learning · Statistics 2025-09-23 Amparo Baíllo, Jose R. Berrendero, Martín Sánchez-Signorini

Learning Deep Kernels for Non-Parametric Two-Sample Tests

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics 2021-01-15 Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland

Two-sample Test with Kernel Projected Wasserstein Distance

We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. This method…

Statistics Theory · Mathematics 2022-05-10 Jie Wang, Rui Gao, Yao Xie