Related papers: Efficient Aggregated Kernel Tests using Incomplete…

Optimal Kernel Combination for Test of Independence against Local Alternatives

Testing the independence between two random variables $x$ and $y$ is an important problem in statistics and machine learning, where kernel-based tests of independence have recently become a focus for studying dependence. The advantage…

Methodology · Statistics 2015-04-14 Wen-Yu Hua, Philip Reiss, Debashis Ghosh

A Permutation-Free Kernel Independence Test

In nonparametric independence testing, we observe i.i.d. data $\{(X_i,Y_i)\}_{i=1}^n$, where $X \in \mathcal{X}, Y \in \mathcal{Y}$ lie in any general spaces, and we wish to test the null that $X$ is independent of $Y$. Modern test…

Methodology · Statistics 2022-12-20 Shubhanshu Shekhar, Ilmun Kim, Aaditya Ramdas

Kernel Two-Sample and Independence Tests for Non-Stationary Random Processes

Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a…

Methodology · Statistics 2021-01-05 Felix Laumann, Julius von Kügelgen, Mauricio Barahona

A Unified View of Optimal Kernel Hypothesis Testing

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with…

Machine Learning · Statistics 2025-12-30 Antonin Schrab

Universal Hypothesis Testing with Kernels: Asymptotically Optimal Tests for Goodness of Fit

We characterize the asymptotic performance of nonparametric goodness of fit testing. The exponential decay rate of the type-II error probability is used as the asymptotic performance metric, and a test is optimal if it achieves the maximum…

Machine Learning · Statistics 2019-03-19 Shengyu Zhu, Biao Chen, Pengfei Yang, Zhitang Chen

KSD Aggregated Goodness-of-fit Test

We investigate properties of goodness-of-fit tests based on the Kernel Stein Discrepancy (KSD). We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels. KSDAgg avoids splitting the…

Machine Learning · Statistics 2023-12-22 Antonin Schrab, Benjamin Guedj, Arthur Gretton

Kernel-based Tests for Joint Independence

We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but…

Statistics Theory · Mathematics 2016-11-07 Niklas Pfister, Peter Bühlmann, Bernhard Schölkopf, Jonas Peters

Learning Kernel Tests Without Data Splitting

Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics. While…

Machine Learning · Computer Science 2020-10-20 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

A fast and accurate kernel-based independence test with applications to high-dimensional and functional data

Testing the dependency between two random variables is an important inference problem in statistics since many statistical procedures rely on the assumption that the two samples are independent. To test whether two samples are independent,…

Methodology · Statistics 2023-01-04 Jin-Ting Zhang, Tianming Zhu

A Uniform Concentration Inequality for Kernel-Based Two-Sample Statistics

In many contemporary statistical and machine learning methods, one needs to optimize an objective function that depends on the discrepancy between two probability distributions. The discrepancy can be referred to as a metric for…

Machine Learning · Computer Science 2025-02-11 Yijin Ni, Xiaoming Huo

Testing Many Constraints in Possibly Irregular Models Using Incomplete U-Statistics

We consider the problem of testing a null hypothesis defined by equality and inequality constraints on a statistical parameter. Testing such hypotheses can be challenging because the number of relevant constraints may be on the same order…

Methodology · Statistics 2024-02-19 Nils Sturma, Mathias Drton, Dennis Leung

A Permutation-free Kernel Two-Sample Test

The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus…

Methodology · Statistics 2025-09-16 Shubhanshu Shekhar, Ilmun Kim, Aaditya Ramdas

A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD

This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these…

Machine Learning · Statistics 2025-11-03 Antonin Schrab

An Adaptive Test of Independence with Analytic Kernel Embeddings

A new computationally efficient dependence measure, and an adaptive statistical test of independence, are proposed. The dependence measure is the difference between analytic embeddings of the joint distribution and the product of the…

Machine Learning · Statistics 2016-10-18 Wittawat Jitkrittum, Zoltan Szabo, Arthur Gretton

A Martingale Kernel Independence Test

The Hilbert-Schmidt Independence Criterion (HSIC) and its joint-independence extension $d\mathrm{HSIC}$ are degenerate $V$-statistics whose data-dependent weighted-$\chi^2$ null limits force a permutation calibration that multiplies the…

Machine Learning · Statistics 2026-05-22 Felix Laumann, Zhaolu Liu, Mauricio Barahona

Randomized incomplete $U$-statistics in high dimensions

This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of Big Data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations tend to be both large, and the computation of…

Statistics Theory · Mathematics 2019-01-29 Xiaohui Chen, Kengo Kato

Computationally efficient permutation tests for the multivariate two-sample problem based on energy distance or maximum mean discrepancy statistics

Non-parametric two-sample tests based on energy distance or maximum mean discrepancy are widely used statistical tests for comparing multivariate data from two populations. While these tests enjoy desirable statistical properties, their…

Computation · Statistics 2024-06-11 Elias Chaibub Neto

A kernel log-rank test of independence for right-censored data

We introduce a general non-parametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation, first in terms of the supremum of a potentially infinite…

Methodology · Statistics 2021-11-23 Tamara Fernandez, Arthur Gretton, David Rindt, Dino Sejdinovic

Adaptive test of independence based on HSIC measures

Dependence measures based on reproducing kernel Hilbert spaces, also known as Hilbert-Schmidt Independence Criterion and denoted HSIC, are widely used to statistically decide whether or not two random vectors are dependent. Recently,…

Statistics Theory · Mathematics 2021-01-13 Mélisande Albert, Béatrice Laurent, Amandine Marrel, Anouar Meynaoui

DUAL: Learning Diverse Kernels for Aggregated Two-sample and Independence Testing

To adapt kernel two-sample and independence testing to complex structured data, aggregation of multiple kernels is frequently employed to boost testing power compared to single-kernel tests. However, we observe a phenomenon that directly…

Machine Learning · Computer Science 2025-10-14 Zhijian Zhou, Xunye Tian, Liuhua Peng, Chao Lei, Antonin Schrab, Danica J. Sutherland, Feng Liu