English
Related papers

Related papers: Two-sample Testing Using Deep Learning

200 papers

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test…

Machine Learning · Statistics 2021-01-15 Feng Liu , Wenkai Xu , Jie Lu , Guangquan Zhang , Arthur Gretton , Danica J. Sutherland

The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that…

Machine Learning · Statistics 2022-06-01 Xiuyuan Cheng , Alexander Cloninger

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li , Visar Berisha , Gautam Dasarathy

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li , Gautam Dasarathy , Karthikeyan Natesan Ramamurthy , Visar Berisha

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim , Ann B. Lee , Jing Lei

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery as well as to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard…

Machine Learning · Computer Science 2023-01-18 Jonas M. Kübler , Vincent Stimper , Simon Buchholz , Krikamol Muandet , Bernhard Schölkopf

Deep neural two-sample tests have recently shown strong power for detecting distributional differences between groups, yet their black-box nature limits interpretability and practical adoption in biomedical analysis. Moreover, most existing…

Computer Vision and Pattern Recognition · Computer Science 2026-02-06 Masoumeh Javanbakhat , Piotr Komorowski , Dilyara Bareeva , Wei-Chang Lai , Wojciech Samek , Christoph Lippert

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai , Bryan Goggin , Qingtang Jiang

In this paper, we address the problem of two-sample testing in the presence of missing data under a variety of missingness mechanisms. Our focus is on the well-known energy distance-based two-sample test. In addition to the standard…

Methodology · Statistics 2025-08-18 Danijel G. Aleksić , Bojana Milošević

The focus of this study is to evaluate the effectiveness of Machine Learning (ML) methods for two-sample testing with right-censored observations. To achieve this, we develop several ML-based methods with varying architectures and implement…

Machine Learning · Computer Science 2024-09-27 Petr Philonenko , Sergey Postovalov

The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…

Statistics Theory · Mathematics 2023-02-09 Stephan Clémençon , Myrto Limnios , Nicolas Vayatis

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses…

Machine Learning · Statistics 2015-06-16 Kacper Chwialkowski , Aaditya Ramdas , Dino Sejdinovic , Arthur Gretton

We consider the problem of two-sample testing in a semi-supervised setting with abundant unlabeled covariate data. Standard two-sample tests neglect covariate information, which has the potential to significantly boost performance. However,…

Machine Learning · Statistics 2026-05-05 Gyumin Lee , Shubhanshu Shekhar , Ilmun Kim

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification…

Machine Learning · Statistics 2017-11-15 Hamed Masnadi-Shirazi

We consider testing for two-sample means of high dimensional populations by thresholding. Two tests are investigated, which are designed for better power performance when the two population mean vectors differ only in sparsely populated…

Methodology · Statistics 2014-10-13 Song Xi Chen , Jun Li , Ping-Shou Zhong

Learning effective data representations has been crucial in non-parametric two-sample testing. Common approaches will first split data into training and test sets and then learn data representations purely on the training set. However,…

Machine Learning · Computer Science 2025-05-09 Xunye Tian , Liuhua Peng , Zhijian Zhou , Mingming Gong , Arthur Gretton , Feng Liu

We construct and analyze a neural network two-sample test to determine whether two datasets came from the same distribution (null hypothesis) or not (alternative hypothesis). We perform time-analysis on a neural tangent kernel (NTK)…

Machine Learning · Statistics 2024-07-11 Varun Khurana , Xiuyuan Cheng , Alexander Cloninger

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming…

Machine Learning · Statistics 2022-01-06 Feng Liu , Wenkai Xu , Jie Lu , Danica J. Sutherland

Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and…

Machine Learning · Computer Science 2023-08-10 Behnam Khojasteh , Friedrich Solowjow , Sebastian Trimpe , Katherine J. Kuchenbecker

Data depth has been applied as a nonparametric measurement for ranking multivariate samples. In this paper, we focus on homogeneity tests to assess whether two multivariate samples are from the same distribution. There are many data…

Statistics Theory · Mathematics 2023-06-09 Yiting Chen , Wei Lin , Xiaoping Shi
‹ Prev 1 2 3 10 Next ›