
Related papers: Comparing distributions: $\ell_1$ geometry improve…

200 papers

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses…

Machine Learning · Statistics · 2015-06-16 · Kacper Chwialkowski, Aaditya Ramdas, Dino Sejdinovic, Arthur Gretton

We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over…

Machine Learning · Computer Science · 2008-05-16 · Arthur Gretton, Karsten Borgwardt, Malte J. Rasch, Bernhard Scholkopf, Alexander J. Smola

Distance-based tests, also called "energy statistics", are leading methods for two-sample and independence tests from the statistics community. Kernel-based tests, developed from "kernel mean embeddings", are leading methods for two-sample…

Machine Learning · Statistics · 2024-06-27 · Cencheng Shen, Joshua T. Vogelstein

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples. When the distributions are locally low-dimensional, the proposed…

Machine Learning · Statistics · 2018-09-03 · Xiuyuan Cheng, Alexander Cloninger, Ronald R. Coifman

In this paper we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the "kernel trick" and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed…

Methodology · Statistics · 2024-08-30 · Anirban Chatterjee, Ziang Niu, Bhaswar B. Bhattacharya

Testing the equality of two conditional distributions is crucial in various modern applications, including transfer learning and causal inference. Despite its importance, this fundamental problem has received surprisingly little attention…

Methodology · Statistics · 2025-09-04 · Jian Yan, Zhuoxi Li, Xianyang Zhang

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics · 2021-01-15 · Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland

Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do…

Methodology · Statistics · 2023-11-21 · Hoseung Song, Hao Chen

Nonparametric two sample testing deals with the question of consistently deciding if two distributions are different, given samples from both, without making any parametric assumptions about the form of the distributions. The current…

Statistics Theory · Mathematics · 2014-11-25 · Aaditya Ramdas, Sashank J. Reddi, Barnabas Poczos, Aarti Singh, Larry Wasserman

Starting with a similarity function between objects, it is possible to define a distance metric on pairs of objects, and more generally on probability distributions over them. These distance metrics have a deep basis in functional analysis,…

Computational Geometry · Computer Science · 2011-03-15 · Sarang Joshi, Raj Varma Kommaraju, Jeff M. Phillips, Suresh Venkatasubramanian

Nonparametric two sample testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. We refer to the most common…

Statistics Theory · Mathematics · 2015-08-05 · Aaditya Ramdas, Sashank J. Reddi, Barnabas Poczos, Aarti Singh, Larry Wasserman

We use a suitable version of the so-called "kernel trick" to devise two-sample (homogeneity) tests, especially focussed on high-dimensional and functional data. Our proposal entails a simplification related to the important practical…

Statistics Theory · Mathematics · 2024-04-24 · Javier Cárcamo, Antonio Cuevas, Luis-Alberto Rodríguez

We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. This method…

Statistics Theory · Mathematics · 2022-05-10 · Jie Wang, Rui Gao, Yao Xie

This paper is about two related decision theoretic problems, nonparametric two-sample testing and independence testing. There is a belief that two recently proposed solutions, based on kernels and distances between pairs of points, behave…

Machine Learning · Statistics · 2014-11-25 · Sashank J. Reddi, Aaditya Ramdas, Barnabás Póczos, Aarti Singh, Larry Wasserman

Kernel mean embeddings are a popular tool that consists in representing probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space. When the kernel is characteristic, mean embeddings can be used…

Machine Learning · Computer Science · 2021-06-29 · Boris Muzellec, Francis Bach, Alessandro Rudi

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics · 2025-10-03 · Hoseung Song, Hao Chen

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming…

Machine Learning · Statistics · 2022-01-06 · Feng Liu, Wenkai Xu, Jie Lu, Danica J. Sutherland

In many contemporary statistical and machine learning methods, one needs to optimize an objective function that depends on the discrepancy between two probability distributions. The discrepancy can be referred to as a metric for…

Machine Learning · Computer Science · 2025-02-11 · Yijin Ni, Xiaoming Huo

We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional differences, because failure to reject the null…

Machine Learning · Statistics · 2026-03-17 · Xing Liu, Axel Gandy

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between…

Machine Learning · Computer Science · 2015-03-20 · Dino Sejdinovic, Arthur Gretton, Bharath Sriperumbudur, Kenji Fukumizu