Related papers: Two-Sample Testing in High-Dimensional Models

Two sample test for covariance matrices in ultra-high dimension

In this paper, we propose a new test for the equality of two population covariance matrices in the ultra-high dimensional setting, where the dimension is much larger than the sizes of both samples. Our proposed methodology…

Methodology · Statistics 2023-12-19 Xiucai Ding, Yichen Hu, Zhenggang Wang

A More Powerful Two-Sample Test in High Dimensions using Random Projection

We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Specifically, we…

Statistics Theory · Mathematics 2015-09-15 Miles E. Lopes, Laurent J. Jacob, Martin J. Wainwright

P-values for high-dimensional regression

Assigning significance in high-dimensional regression is challenging. Most computationally efficient selection algorithms cannot guard against inclusion of noise variables. Asymptotically valid p-values are not available. An exception is a…

Methodology · Statistics 2009-06-12 Nicolai Meinshausen, Lukas Meier, Peter Bühlmann

Two-Sample Tests for High Dimensional Means with Thresholding and Data Transformation

We consider testing for two-sample means of high dimensional populations by thresholding. Two tests are investigated, which are designed for better power performance when the two population mean vectors differ only in sparsely populated…

Methodology · Statistics 2014-10-13 Song Xi Chen, Jun Li, Ping-Shou Zhong

Two sample tests for high-dimensional covariance matrices

We propose two tests for the equality of covariance matrices between two high-dimensional populations. One test is on the whole variance-covariance matrices, and the other is on off-diagonal sub-matrices, which define the covariance…

Statistics Theory · Mathematics 2012-06-06 Jun Li, Song Xi Chen

Tests for High-Dimensional Covariance Matrices Using Random Matrix Projection

The classic likelihood ratio test for the equality of two covariance matrices breaks down due to the singularity of the sample covariance matrices when the data dimension $p$ is larger than the sample size $n$. In this paper, we…

Methodology · Statistics 2015-11-06 Tung-Lung Wu, Ping Li

Statistical divergences in high-dimensional hypothesis testing and a modern technique for estimating them

Hypothesis testing in high dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between…

Data Analysis, Statistics and Probability · Physics 2024-08-02 Jeremy J. H. Wilkinson, Christopher G. Lester

E-Valuating Classifier Two-Sample Tests

We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and…

Methodology · Statistics 2024-05-01 Teodora Pandeva, Tim Bakker, Christian A. Naesseth, Patrick Forré

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

A high-dimensional two-sample test for the mean using random subspaces

A common problem in genetics is that of testing whether a set of highly dependent gene expressions differs between two populations, typically in a high-dimensional setting where the data dimension is larger than the sample size. Most…

Methodology · Statistics 2015-03-11 Måns Thulin

PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing…

Machine Learning · Statistics 2025-09-11 Pablo Lemos, Sammy Sharief, Nikolay Malkin, Salma Salhi, Connor Stone, Laurence Perreault-Levasseur, Yashar Hezaveh

Scaling property of the statistical Two-Sample Energy Test

The energy test is a powerful binning-free, multi-dimensional and distribution-free tool that can be applied to compare a measurement to a given prediction (goodness-of-fit) or to check whether two data samples originate from the same…

Data Analysis, Statistics and Probability · Physics 2018-04-30 G. Zech

Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values

Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may…

Methodology · Statistics 2024-09-05 F. Richard Guo, Rajen D. Shah

Two-Sample Test for Sparse High Dimensional Multinomial Distributions

In this paper, we consider testing the equality of the probability vectors of two independent multinomial distributions in high dimension. The classical chi-square test may have some drawbacks in this case since many of the cell counts may be zero…

Statistics Theory · Mathematics 2017-11-16 Amanda Plunkett, Junyong Park

Bayesian Optimal Two-sample Tests in High-dimension

We propose optimal Bayesian two-sample tests for testing equality of high-dimensional mean vectors and covariance matrices between two populations. In many applications including genomics and medical imaging, it is natural to assume that…

Methodology · Statistics 2021-12-07 Kyoungjae Lee, Kisung You, Lizhen Lin

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

Two-cluster test

Cluster analysis is a fundamental research issue in statistics and machine learning. In many modern clustering methods, we need to determine whether two subsets of samples come from the same cluster. Since these subsets are usually…

Machine Learning · Computer Science 2025-07-15 Xinying Liu, Lianyu Hu, Mudi Jiang, Simeng Zhang, Jun Lou, Zengyou He

A two-sample test for high-dimensional data with applications to gene-set testing

We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work for this "large $p$, small $n$" situation. The proposed test…

Statistics Theory · Mathematics 2010-02-25 Song Xi Chen, Ying-Li Qin

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

High-dimensional empirical likelihood inference

High-dimensional statistical inference with general estimating equations is challenging and remains less explored. In this paper, we study two problems in the area: confidence set estimation for multiple components of the model parameters,…

Methodology · Statistics 2021-04-28 Jinyuan Chang, Song Xi Chen, Cheng Yong Tang, Tong Tong Wu