English
Related papers

Related papers: Kernel Two-Sample Hypothesis Testing Using Kernel …

200 papers

We present a general framework for hypothesis testing on distributions of sets of individual examples. Sets may represent many common data sources such as groups of observations in time series, collections of words in text or a batch of…

Methodology · Statistics 2021-02-03 Alexis Bellot , Mihaela van der Schaar

In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where…

Methodology · Statistics 2022-01-25 Qinyi Zhang , Veit Wild , Sarah Filippi , Seth Flaxman , Dino Sejdinovic

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics 2021-01-15 Feng Liu , Wenkai Xu , Jie Lu , Guangquan Zhang , Arthur Gretton , Danica J. Sutherland

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming…

Machine Learning · Statistics 2022-01-06 Feng Liu , Wenkai Xu , Jie Lu , Danica J. Sutherland

Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do…

Methodology · Statistics 2023-11-21 Hoseung Song , Hao Chen

We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over…

Machine Learning · Computer Science 2008-05-16 Arthur Gretton , Karsten Borgwardt , Malte J. Rasch , Bernhard Scholkopf , Alexander J. Smola

Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and…

Machine Learning · Computer Science 2023-08-10 Behnam Khojasteh , Friedrich Solowjow , Sebastian Trimpe , Katherine J. Kuchenbecker

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler , Shahryar Khorasani , Marius Kloft , Christoph Lippert

We propose a new one-sample test for normality in a Reproducing Kernel Hilbert Space (RKHS). Namely, we test the null-hypothesis of belonging to a given family of Gaussian distributions. Hence our procedure may be applied either to test…

Statistics Theory · Mathematics 2015-07-13 Jérémie Kellner , Alain Celisse

We study strictly proper scoring rules in the Reproducing Kernel Hilbert Space. We propose a general Kernel Scoring rule and associated Kernel Divergence. We consider conditions under which the Kernel Score is strictly proper. We then…

Machine Learning · Statistics 2017-04-25 Hamed Masnadi-Shirazi

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He , Chaohua Sheng , Yan Liu , Quan Zou

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li , Gautam Dasarathy , Karthikeyan Natesan Ramamurthy , Visar Berisha

Model misspecification can create significant challenges for the implementation of probabilistic models, and this has led to development of a range of robust methods which directly account for this issue. However, whether these more…

Machine Learning · Statistics 2025-04-22 Oscar Key , Arthur Gretton , François-Xavier Briol , Tamara Fernandez

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…

Machine Learning · Statistics 2018-03-14 David Lopez-Paz , Maxime Oquab

Anomaly detection based on one-class classification algorithms is broadly used in many applied domains like image processing (e.g. detection of whether a patient is "cancerous" or "healthy" from mammography image), network intrusion…

Machine Learning · Statistics 2017-07-14 Evgeny Burnaev , Pavel Erofeev , Dmitry Smolyakov

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang , Santanu S. Dey , Yao Xie

Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…

Methodology · Statistics 2025-10-03 Hoseung Song , Hao Chen

We consider the problem of two-sample testing in a semi-supervised setting with abundant unlabeled covariate data. Standard two-sample tests neglect covariate information, which has the potential to significantly boost performance. However,…

Machine Learning · Statistics 2026-05-05 Gyumin Lee , Shubhanshu Shekhar , Ilmun Kim

Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how…

Statistics Theory · Mathematics 2021-09-08 Boris Landa , Rihao Qu , Joseph Chang , Yuval Kluger

In supervised learning with distributional inputs in the two-stage sampling setup, relevant to applications like learning-based medical screening or causal learning, the inputs (which are probability distributions) are not accessible in the…

Machine Learning · Computer Science 2026-01-22 Christian Fiedler
‹ Prev 1 2 3 10 Next ›