English
Related papers

Related papers: A uniform kernel trick for high-dimensional two-sa…

200 papers

Testing the equality of two conditional distributions is crucial in various modern applications, including transfer learning and causal inference. Despite its importance, this fundamental problem has received surprisingly little attention…

Methodology · Statistics 2025-09-04 Jian Yan , Zhuoxi Li , Xianyang Zhang

We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over…

Machine Learning · Computer Science 2008-05-16 Arthur Gretton , Karsten Borgwardt , Malte J. Rasch , Bernhard Scholkopf , Alexander J. Smola

Distance-based tests, also called "energy statistics", are leading methods for two-sample and independence tests from the statistics community. Kernel-based tests, developed from "kernel mean embeddings", are leading methods for two-sample…

Machine Learning · Statistics 2024-06-27 Cencheng Shen , Joshua T. Vogelstein

We propose a nonparametric two-sample test procedure based on Maximum Mean Discrepancy (MMD) for testing the hypothesis that two samples of functions have the same underlying distribution, using kernels defined on function spaces. This…

Statistics Theory · Mathematics 2020-10-20 George Wynne , Andrew B. Duncan

In this paper we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the `kernel trick' and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed…

Methodology · Statistics 2024-08-30 Anirban Chatterjee , Ziang Niu , Bhaswar B. Bhattacharya

In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where…

Methodology · Statistics 2022-01-25 Qinyi Zhang , Veit Wild , Sarah Filippi , Seth Flaxman , Dino Sejdinovic

We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. This method…

Statistics Theory · Mathematics 2022-05-10 Jie Wang , Rui Gao , Yao Xie

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics 2021-01-15 Feng Liu , Wenkai Xu , Jie Lu , Guangquan Zhang , Arthur Gretton , Danica J. Sutherland

In many contemporary statistical and machine learning methods, one needs to optimize an objective function that depends on the discrepancy between two probability distributions. The discrepancy can be referred to as a metric for…

Machine Learning · Computer Science 2025-02-11 Yijin Ni , Xiaoming Huo

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics 2025-10-03 Hoseung Song , Hao Chen

To adapt kernel two-sample and independence testing to complex structured data, aggregation of multiple kernels is frequently employed to boost testing power compared to single-kernel tests. However, we observe a phenomenon that directly…

Machine Learning · Computer Science 2025-10-14 Zhijian Zhou , Xunye Tian , Liuhua Peng , Chao Lei , Antonin Schrab , Danica J. Sutherland , Feng Liu

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in reproducing kernel Hilbert space…

Methodology · Statistics 2025-08-21 Rui Cui , Yuhao Li , Xiaojun Song

Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do…

Methodology · Statistics 2023-11-21 Hoseung Song , Hao Chen

Are two sets of observations drawn from the same distribution? This problem is a two-sample test. Kernel methods lead to many appealing properties. Indeed state-of-the-art approaches use the $L^2$ distance between kernel-based distribution…

Machine Learning · Statistics 2019-10-02 M. Scetbon , G. Varoquaux

Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and…

Machine Learning · Computer Science 2023-08-10 Behnam Khojasteh , Friedrich Solowjow , Sebastian Trimpe , Katherine J. Kuchenbecker

Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems. This is particularly relevant for data generated by dynamical systems since such systems are essential for many…

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with…

Machine Learning · Statistics 2025-12-30 Antonin Schrab

In this paper we deal with the problem of testing for the equality of $k$ probability distributions defined on $(\mathcal{X},\mathcal{B})$, where $\mathcal{X}$ is a metric space and $\mathcal{B}$ is the corresponding Borel $\sigma$-field.…

Statistics Theory · Mathematics 2018-12-04 Armando Sosthene Kali Balogoun , Guy Martial Nkiet , Carlos Ogouyandjou

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics 2018-09-03 Xiuyuan Cheng , Alexander Cloninger , Ronald R. Coifman
‹ Prev 1 2 3 10 Next ›