Related papers: A new graph-based two-sample test for multivariate…
Two-sample tests for multivariate data and non-Euclidean data are widely used in many fields. Parametric tests are mostly restrained to certain types of data that meets the assumptions of the parametric models. In this paper, we study a…
Testing the equality in distributions of multiple samples is a common task in many fields. However, this problem for high-dimensional or non-Euclidean data has not been well explored. In this paper, we propose new nonparametric tests based…
In the regime of two-sample comparison, tests based on a graph constructed on observations by utilizing similarity information among them is gaining attention due to their flexibility and good performances for high-dimensional/non-Euclidean…
Two-sample tests utilizing a similarity graph on observations are useful for high-dimensional and non-Euclidean data due to their flexibility and good performance under a wide range of alternatives. Existing works mainly focused on sparse…
Testing for the equality of two high-dimensional distributions is a challenging problem, and this becomes even more challenging when the sample size is small. Over the last few decades, several graph-based two-sample tests have been…
Repeated observations have become increasingly common in biomedical research and longitudinal studies. For instance, wearable sensor devices are deployed to continuously track physiological and biological signals from each individual over…
Network (graph) data analysis is a popular research topic in statistics and machine learning. In application, one is frequently confronted with graph two-sample hypothesis testing where the goal is to test the difference between two graph…
We study the problem of two-sample comparison with categorical data when the contingency table is sparsely populated. In modern applications, the number of categories is often comparable to the sample size, causing existing methods to have…
Graph-based tests are a class of non-parametric two-sample tests useful for analyzing high-dimensional data. The test statistics are constructed from similarity graphs (such as K-minimum spanning tree), and consequently, their performance…
Data depth has been applied as a nonparametric measurement for ranking multivariate samples. In this paper, we focus on homogeneity tests to assess whether two multivariate samples are from the same distribution. There are many data…
In this paper, we study the problem of testing the equality of two multivariate distributions. One class of tests used for this purpose utilizes geometric graphs constructed using inter-point distances. So far, the asymptotic theory of…
Network data is a major object data type that has been widely collected or derived from common sources such as brain imaging. Such data contains numeric, topological, and geometrical information, and may be necessarily considered in certain…
Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies…
In this article, we revisit and expand our prior work on graph similarity. As with our earlier work, we focus on a view of similarity which does not require node correspondence between graphs under comparison. Our work is suited to the…
We present the results of a large number of simulation studies regarding the power of various non-parametric two-sample tests for multivariate data. This includes both continuous and discrete data. In general no single method can be relied…
Most existing methods for testing equality of means of functional data from multiple populations rely on assumptions of equal covariance and/or Gaussianity. In this work we provide a new testing method based on a statistic that is…
The two-sample test is a fundamental problem in statistics with a wide range of applications. In the realm of high-dimensional data, nonparametric methods have gained prominence due to their flexibility and minimal distributional…
We consider multivariate two-sample tests of means, where the location shift between the two populations is expected to be related to a known graph structure. An important application of such tests is the detection of differentially…
We consider the problem of two-sample testing in a semi-supervised setting with abundant unlabeled covariate data. Standard two-sample tests neglect covariate information, which has the potential to significantly boost performance. However,…
Rank-based approaches are among the most popular nonparametric methods for univariate data in tackling statistical problems such as hypothesis testing due to their robustness and effectiveness. However, they are unsatisfactory for more…