Related papers: Two-Sample Testing in High-Dimensional Models

200 papers

In this paper, we propose a new test for the equality of two population covariance matrices in the ultra-high dimensional setting, where the dimension is much larger than both sample sizes. Our proposed methodology…

Methodology · Statistics 2023-12-19 Xiucai Ding, Yichen Hu, Zhenggang Wang

We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Specifically, we…

Statistics Theory · Mathematics 2015-09-15 Miles E. Lopes, Laurent J. Jacob, Martin J. Wainwright

Assigning significance in high-dimensional regression is challenging. Most computationally efficient selection algorithms cannot guard against inclusion of noise variables. Asymptotically valid p-values are not available. An exception is a…

Methodology · Statistics 2009-06-12 Nicolai Meinshausen, Lukas Meier, Peter Bühlmann

We consider testing for two-sample means of high dimensional populations by thresholding. Two tests are investigated, which are designed for better power performance when the two population mean vectors differ only in sparsely populated…

Methodology · Statistics 2014-10-13 Song Xi Chen, Jun Li, Ping-Shou Zhong

We propose two tests for the equality of covariance matrices between two high-dimensional populations. One test is on the whole variance-covariance matrices, and the other is on off-diagonal sub-matrices, which define the covariance…

Statistics Theory · Mathematics 2012-06-06 Jun Li, Song Xi Chen

The classic likelihood ratio test for the equality of two covariance matrices breaks down due to the singularity of the sample covariance matrices when the data dimension $p$ is larger than the sample size $n$. In this paper, we…

Methodology · Statistics 2015-11-06 Tung-Lung Wu, Ping Li

Hypothesis testing in high dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between…

Data Analysis, Statistics and Probability · Physics 2024-08-02 Jeremy J. H. Wilkinson, Christopher G. Lester

We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and…

Methodology · Statistics 2024-05-01 Teodora Pandeva, Tim Bakker, Christian A. Naesseth, Patrick Forré

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…

Methodology · Statistics 2019-11-19 Ilmun Kim, Ann B. Lee, Jing Lei

A common problem in genetics is that of testing whether a set of highly dependent gene expressions differ between two populations, typically in a high-dimensional setting where the data dimension is larger than the sample size. Most…

Methodology · Statistics 2015-03-11 Måns Thulin

We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing…

The energy test is a powerful binning-free, multi-dimensional and distribution-free tool that can be applied to compare a measurement to a given prediction (goodness-of-fit) or to check whether two data samples originate from the same…

Data Analysis, Statistics and Probability · Physics 2018-04-30 G. Zech

Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may…

Methodology · Statistics 2024-09-05 F. Richard Guo, Rajen D. Shah

In this paper, we consider testing the equality of probability vectors of two independent multinomial distributions in high dimension. The classical chi-square test may have some drawbacks in this case since many of the cell counts may be zero…

Statistics Theory · Mathematics 2017-11-16 Amanda Plunkett, Junyong Park

We propose optimal Bayesian two-sample tests for testing equality of high-dimensional mean vectors and covariance matrices between two populations. In many applications including genomics and medical imaging, it is natural to assume that…

Methodology · Statistics 2021-12-07 Kyoungjae Lee, Kisung You, Lizhen Lin

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai, Bryan Goggin, Qingtang Jiang

Cluster analysis is a fundamental research issue in statistics and machine learning. In many modern clustering methods, we need to determine whether two subsets of samples come from the same cluster. Since these subsets are usually…

Machine Learning · Computer Science 2025-07-15 Xinying Liu, Lianyu Hu, Mudi Jiang, Simeng Zhang, Jun Lou, Zengyou He

We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work for this "large $p$, small $n$" situation. The proposed test…

Statistics Theory · Mathematics 2010-02-25 Song Xi Chen, Ying-Li Qin

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

High-dimensional statistical inference with general estimating equations is challenging and remains less explored. In this paper, we study two problems in the area: confidence set estimation for multiple components of the model parameters,…

Methodology · Statistics 2021-04-28 Jinyuan Chang, Song Xi Chen, Cheng Yong Tang, Tong Tong Wu