Related papers: Robustness of multiple testing procedures against …

Strong approximations of level exceedences related to multiple hypothesis testing

Particularly in genomics, but also in other fields, it has become commonplace to undertake highly multiple Student's $t$-tests based on relatively small sample sizes. The literature on this topic is continually expanding, but the main…

Statistics Theory · Mathematics 2010-10-11 Peter Hall, Qiying Wang

A New Step-down Procedure for Simultaneous Hypothesis Testing Under Dependence

In this article, we consider the problem of simultaneous testing of hypotheses when the individual test statistics are not necessarily independent. Specifically, we consider the problem of simultaneous testing of point null hypotheses…

Statistics Theory · Mathematics 2018-07-17 Prasenjit Ghosh, Arijit Chakrabarti

Post-clustering difference testing: valid inference and practical considerations

Clustering is part of unsupervised analysis methods that consist in grouping samples into homogeneous and separate subgroups of observations also called clusters. To interpret the clusters, statistical hypothesis testing is often used to…

Methodology · Statistics 2022-10-25 Benjamin Hivert, Denis Agniel, Rodolphe Thiébaut, Boris P Hejblum

False Discovery Variance Reduction in Large Scale Simultaneous Hypothesis Tests

Statistical dependence between hypotheses poses a significant challenge to the stability of large scale multiple hypotheses testing. Ignoring it often results in an unacceptably large spread in the false positive proportion even though the…

Methodology · Statistics 2018-10-15 Sairam Rayaprolu, Zhiyi Chi

A Practitioner's Guide to Multiple Testing Error Rates

It is quite common in modern research for a researcher to test many hypotheses. The statistical (frequentist) hypothesis testing framework does not scale with the number of hypotheses in the sense that naively performing many hypothesis…

Methodology · Statistics 2013-06-26 Jonathan Rosenblatt

Two-cluster test

Cluster analysis is a fundamental research issue in statistics and machine learning. In many modern clustering methods, we need to determine whether two subsets of samples come from the same cluster. Since these subsets are usually…

Machine Learning · Computer Science 2025-07-15 Xinying Liu, Lianyu Hu, Mudi Jiang, Simeng Zhang, Jun Lou, Zengyou He

How to Tell When a Result Will Replicate: Significance and Replication in Distributional Null Hypothesis Tests

There is a well-known problem in Null Hypothesis Significance Testing: many statistically significant results fail to replicate in subsequent experiments. We show that this problem arises because standard 'point-form null' significance…

Methodology · Statistics 2025-02-06 Fintan Costello, Paul Watts

Multiscale Fisher's Independence Test for Multivariate Dependence

Identifying dependency in multivariate data is a common inference task that arises in numerous applications. However, existing nonparametric independence tests typically require computation that scales at least quadratically with the sample…

Methodology · Statistics 2021-07-08 Shai Gorsky, Li Ma

On Hypothesis Testing for Poisson Processes. Singular Cases

We consider the problem of hypothesis testing in the situation where the first hypothesis is simple and the second one is local one-sided composite. We describe the choice of the thresholds and the power functions of different tests when…

Statistics Theory · Mathematics 2015-02-25 Serguei Dachian, Yury Kutoyants, Lin Yang

Compound p-Value Statistics for Multiple Testing Procedures

Many multiple testing procedures make use of the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently…

Methodology · Statistics 2011-08-25 Joshua D. Habiger, Edsel A. Pena

Testing independence with high-dimensional correlated samples

Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging $n$ identically distributed $p$-dimensional random vectors into a $p \times n$ data matrix, we investigate…

Statistics Theory · Mathematics 2017-03-28 Xi Chen, Weidong Liu

Optimal Prediction-Augmented Algorithms for Testing Independence of Distributions

Independence testing is a fundamental problem in statistical inference: given samples from a joint distribution $p$ over multiple random variables, the goal is to determine whether $p$ is a product distribution or is $\epsilon$-far from all…

Machine Learning · Statistics 2026-03-06 Maryam Aliakbarpour, Alireza Azizi, Ria Stevens

On the universal calibration of heavy-tailed combination tests

It is often of interest to test a global null hypothesis using multiple, possibly dependent $p$-values by combining their strengths while controlling the type-I error. Recently, several heavy-tailed combination tests, such as the harmonic…

Statistics Theory · Mathematics 2026-03-25 Parijat Chakraborty, F. Richard Guo, Kerby Shedden, Stilian Stoev

Sequential Predictive Two-Sample and Independence Testing

We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data,…

Machine Learning · Statistics 2023-07-21 Aleksandr Podkopaev, Aaditya Ramdas

On distinguishability of hypotheses

We consider the problems of hypothesis testing on a probability measure of independent sample, on solution of ill-posed problem, on deconvolution problem and on Poisson mean measure. For all these setups necessary conditions and sufficient…

Statistics Theory · Mathematics 2013-10-24 Mikhail Ermakov

Asymptotic Distribution-Free Independence Test for High Dimension Data

Test of independence is of fundamental importance in modern data analysis, with broad applications in variable selection, graphical models, and causal inference. When the data is high dimensional and the potential dependence signal is…

Methodology · Statistics 2023-06-13 Zhanrui Cai, Jing Lei, Kathryn Roeder

Differentially Private Nonparametric Hypothesis Testing

Hypothesis tests are a crucial statistical tool for data mining and are the workhorse of scientific research in many fields. Here we study differentially private tests of independence between a categorical and a continuous variable. We take…

Methodology · Statistics 2019-03-25 Simon Couch, Zeki Kazan, Kaiyan Shi, Andrew Bray, Adam Groce

More accurate tests for the statistical significance of result differences

Statistical significance testing of differences in values of metrics like recall, precision and balanced F-score is a necessary part of empirical natural language processing. Unfortunately, we find in a set of experiments that many commonly…

Computation and Language · Computer Science 2007-05-23 Alexander Yeh

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

Joint Sequential Detection and Isolation for Dependent Data Streams

The problem of joint sequential detection and isolation is considered in the context of multiple, not necessarily independent, data streams. A multiple testing framework is proposed, where each hypothesis corresponds to a different subset…

Statistics Theory · Mathematics 2022-07-04 Anamitra Chaudhuri, Georgios Fellouris