Related papers: Two-Sample Test Based on Classification Probabilit…
Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…
The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that…
We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and…
Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…
Following the line of classification-based two-sample testing, tests based on the Random Forest classifier are proposed. The developed tests are easy to use, require almost no tuning, and are applicable for any distribution on…
The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high dimensions is the subject of much attention,…
So-called linear rank statistics provide a means for distribution-free (even in finite samples), yet highly flexible, two-sample testing in the setting of univariate random variables. Their flexibility derives from a choice of weights that…
We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…
The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…
This paper introduces a statistical test inferring whether a variable allows separating two classes by means of a single critical value. Its test statistic is the prediction error of a nonparametric threshold classifier. While this approach…
We consider the problem of testing the equality of conditional distributions of a response variable given a vector of covariates between two populations. Such a hypothesis testing problem can be motivated from various machine learning and…
Despite their importance in supporting experimental conclusions, standard statistical tests are often inadequate for research areas, like the life sciences, where the typical sample size is small and the test assumptions difficult to…
Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…
A number of applications require two-sample testing on ranked preference data. For instance, in crowdsourcing, there is a long-standing question of whether pairwise comparison data provided by people is distributed similarly to…
Hypothesis testing in high dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between…
Modeling systems is a classical approach to ensuring their reliability, since it is suitable both for formal verification and for software testing techniques. In the context of model-based testing, an approach combining random testing and…
Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…
A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample…
We propose novel methodology for testing equality of model parameters between two high-dimensional populations. The technique is very general and applicable to a wide range of models. The method is based on sample splitting: the data is…
Parametric hypothesis testing associated with two independent samples arises frequently in applications in biology, medical sciences, epidemiology, reliability, and many more. In this paper, we propose robust Wald-type tests for…