Related papers: Extremely efficient permutation and bootstrap hypo…

Fast computation of p-values for the permutation test based on Pearson's correlation coefficient and other statistical tests

Permutation tests are among the simplest and most widely used statistical tools. Their p-values can be computed by a straightforward sampling of permutations. However, this way of computing p-values is often so slow that it is replaced by…

Computation · Statistics 2018-07-27 Jean-Marie Droz

Efficiently estimating small p-values in permutation tests using importance sampling and cross-entropy method

Permutation tests are widely used for statistical hypothesis testing when the sampling distribution of the test statistic under the null hypothesis is analytically intractable or unreliable due to finite sample sizes. One critical challenge…

Computation · Statistics 2023-08-29 Yang Shi, Huining Kang, Ji-Hyun Lee, Hui Jiang

Derivation of Analytic Formulas for the Sample Moments of the Sample Correlation over Permutations of Data

Pearson's correlation is among the most widely reported measures of association. The strength of the statistical evidence for linear association is determined by the p-value of a hypothesis test. If the true distribution of a dataset is…

Statistics Theory · Mathematics 2021-08-31 Marc Jaffrey, Michael Dushkoff

Analytic Formulas for the Sample Moments of the Sample Correlation over Permutations of Data

Presented is an inductive formula for computing the sample moments of the distribution of Pearson's sample correlation over permutation of data. These exact formulas for the sample moments suggest the possibility of more precise and…

Statistics Theory · Mathematics 2021-08-31 Marc Jaffrey, Michael Dushkoff

Computationally efficient permutation tests for the multivariate two-sample problem based on energy distance or maximum mean discrepancy statistics

Non-parametric two-sample tests based on energy distance or maximum mean discrepancy are widely used statistical tests for comparing multivariate data from two populations. While these tests enjoy desirable statistical properties, their…

Computation · Statistics 2024-06-11 Elias Chaibub Neto

Resampling-Based Multisplit Inference for High-Dimensional Regression

We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make…

Methodology · Statistics 2022-05-26 Anna Vesely, Jelle J. Goeman, Livio Finos

A high-dimensional two-sample test for the mean using random subspaces

A common problem in genetics is that of testing whether a set of highly dependent gene expressions differ between two populations, typically in a high-dimensional setting where the data dimension is larger than the sample size. Most…

Methodology · Statistics 2015-03-11 Måns Thulin

On the Permutation Distribution of Independence Tests

One of the most popular classes of tests for independence between two random variables is the general class of rank statistics which are invariant under permutations. This class contains Spearman's coefficient of rank correlation statistic,…

Computation · Statistics 2009-02-04 Ehab F. Abd-Elfattah

Randomized p-values for multiple testing and their application in replicability analysis

We are concerned with testing replicability hypotheses for many endpoints simultaneously. This constitutes a multiple test problem with composite null hypotheses. Traditional $p$-values, which are computed under least favourable parameter…

Methodology · Statistics 2020-02-26 Anh-Tuan Hoang, Thorsten Dickhaus

Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn

Permutation tests are amongst the most commonly used statistical tools in modern genomic research, a process by which p-values are attached to a test statistic by randomly permuting the sample or gene labels. Yet permutation p-values…

Applications · Statistics 2016-03-21 Belinda Phipson, Gordon K. Smyth

Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values

Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may…

Methodology · Statistics 2024-09-05 F. Richard Guo, Rajen D. Shah

Permutation p-value approximation via generalized Stolarsky invariance

It is common for genomic data analysis to use $p$-values from a large number of permutation tests. The multiplicity of tests may require very tiny $p$-values in order to reject any null hypotheses and the common practice of using randomly…

Statistics Theory · Mathematics 2017-08-10 Hera Yu He, Kinjal Basu, Qingyuan Zhao, Art B. Owen

Testing independence with high-dimensional correlated samples

Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging $n$ identically distributed $p$-dimensional random vectors into a $p \times n$ data matrix, we investigate…

Statistics Theory · Mathematics 2017-03-28 Xi Chen, Weidong Liu

Permutation tests using arbitrary permutation distributions

Permutation tests date back nearly a century to Fisher's randomized experiments, and remain an immensely popular statistical tool, used for testing hypotheses of independence between variables and other common inferential questions. Much of…

Methodology · Statistics 2022-12-05 Aaditya Ramdas, Rina Foygel Barber, Emmanuel J. Candes, Ryan J. Tibshirani

Optimal P-value Weighting with Independent Information

The large-scale multiple testing inherent to high-throughput biological data necessitates very high statistical stringency and thus true effects in data are difficult to detect unless they have high effect sizes. One solution to this…

Methodology · Statistics 2017-12-21 Mohamad S. Hasan

Inference with Sequential Monte-Carlo Computation of $p$-values: Fast and Valid Approaches

Hypothesis tests calibrated by (re)sampling methods (such as permutation, rank and bootstrap tests) are useful tools for statistical analysis, at the computational cost of requiring Monte-Carlo sampling for calibration. It is common and…

Methodology · Statistics 2024-09-30 Ivo V. Stoepker, Rui M. Castro

Multivariate quantile-based permutation tests with application to functional data

Permutation tests enable testing statistical hypotheses in situations when the distribution of the test statistic is complicated or not available. In some situations, the test statistic under investigation is multivariate, with the multiple…

Methodology · Statistics 2023-11-08 Zdeněk Hlávka, Daniel Hlubinka, Šárka Hudecová

A note on the distribution of the partial correlation coefficient with nonparametrically estimated marginal regressions

There has been much interest in the nonparametric testing of conditional independence in the econometric and statistical literature, but the simplest and potentially most useful method, based on the sample partial correlation, seems to have…

Statistics Theory · Mathematics 2020-05-27 Wicher Bergsma

Generalized R-squared for Detecting Dependence

Detecting dependence between two random variables is a fundamental problem. Although the Pearson correlation is effective for capturing linear dependency, it can be entirely powerless for detecting nonlinear and/or heteroscedastic patterns.…

Methodology · Statistics 2016-11-21 Xufei Wang, Bo Jiang, Jun S. Liu

Detecting changes in cross-sectional dependence in multivariate time series

Classical and more recent tests for detecting distributional changes in multivariate time series often lack power against alternatives that involve changes in the cross-sectional dependence structure. To be able to detect such changes…

Statistics Theory · Mathematics 2014-09-16 Axel Bücher, Ivan Kojadinovic, Tom Rohmer, Johan Segers