Related papers: On two-sample testing for data with arbitrarily mi…
In many real-world applications, it is common that a proportion of the data may be missing or only partially observed. We develop a novel two-sample testing method based on the Maximum Mean Discrepancy (MMD) which accounts for missing data…
This work proposes a novel rank-based scale two-sample testing method for univariate, distinct data when a subset of the data may be missing. Our approach is based on mathematically tight bounds of the Ansari-Bradley test statistic in the…
In this paper, we address the problem of two-sample testing in the presence of missing data under a variety of missingness mechanisms. Our focus is on the well-known energy distance-based two-sample test. In addition to the standard…
A fundamental challenge in comparing two survival distributions with right censored data is the selection of an appropriate nonparametric test, as the power of standard tests like the Log rank and Wilcoxon is highly dependent on the often…
Missing data are common in practice, and statistical methods have been developed to address the validity and/or efficiency of statistical procedures. A central and longstanding interest has been the mechanism…
Detecting and locating changes in highly multivariate data is a major concern in several current statistical applications. In this context, the first contribution of the paper is a novel non-parametric two-sample homogeneity test for…
The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…
Nonparametric tests for functional data are a challenging class of tests to work with because of the potentially high dimensional nature of the data. One of the main challenges for considering rank-based tests, like the Mann-Whitney or…
Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a…
Missing data is a common issue in many biomedical studies. Under a paired design, some subjects may have missing values in either one or both of the conditions due to loss of follow-up, insufficient biological samples, etc. Such partially…
The Wilcoxon-Mann-Whitney test is a robust competitor of the t-test in the univariate setting. For finite dimensional multivariate data, several extensions of the Wilcoxon-Mann-Whitney test have been shown to have better performance than…
There are many different proposed procedures for sample size planning for the Wilcoxon-Mann-Whitney test at given type-I and type-II error rates $\alpha$ and $\beta$, respectively. Most methods assume very specific models or types of data…
We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…
Multi-source and multi-modal datasets are increasingly common in scientific research, yet they often exhibit block-wise missingness, where entire modalities are systematically absent in some sources or no single source contains all…
Given independent samples from two univariate distributions, the one-sided Wilcoxon-Mann-Whitney statistic may be used to conduct a rank-based test of first-order stochastic dominance. We broaden the scope of applicability of such tests by…
The Wilcoxon signed-rank test and the Wilcoxon-Mann-Whitney test are commonly employed in one-sample and two-sample mean tests for one-dimensional hypothesis problems. For high-dimensional mean test problems, we calculate the asymptotic…
Missing values are a recurring difficulty when dealing with paired data. Several test procedures have been developed in the literature to tackle this problem. Some of them are even robust under deviations and control type-I error quite…
Rank-based approaches are among the most popular nonparametric methods for univariate data in tackling statistical problems such as hypothesis testing due to their robustness and effectiveness. However, they are unsatisfactory for more…
Although unbiasedness is a basic property of a good test, many tests on vector parameters or scalar parameters against two-sided alternatives are not finite-sample unbiased. This was already noticed by Sugiura [Ann. Inst. Statist. Math. 17…
Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable - a twofold assumption dependent on…