Related papers: Covariate powered cross-weighted multiple testing
The power of multiple testing procedures can be increased by using weighted p-values (Genovese, Roeder and Wasserman 2005). We derive the optimal weights and we show that the power is remarkably robust to misspecification of these weights.…
The large-scale multiple testing inherent to high throughput biological data necessitates very high statistical stringency and thus true effects in data are difficult to detect unless they have high effect sizes. One solution to this…
The problem of multiple hypothesis testing arises when there are more than one hypothesis to be tested simultaneously for statistical significance. This is a very common situation in many data mining applications. For instance, assessing…
We study how to combine p-values and e-values, and design multiple testing procedures where both p-values and e-values are available for every hypothesis. Our results provide a new perspective on multiple testing with data-driven weights:…
Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied.…
Genetic investigations often involve the testing of vast numbers of related hypotheses simultaneously. To control the overall error rate, a substantial penalty is required, making it difficult to detect signals of moderate strength. To…
Genome-wide association analysis has generated much discussion about how to preserve power to detect signals despite the detrimental effect of multiple testing on power. We develop a weighted multiple testing procedure that facilitates the…
The large-scale multiple testing inherent to high throughput biological data necessitates very high statistical stringency and thus true effects in data are difficult to detect unless they have high effect sizes. One promising approach for…
We propose a method for multiple hypothesis testing with familywise error rate (FWER) control, called the i-FWER test. Most testing methods are predefined algorithms that do not allow modifications after observing the data. However, in…
This paper proposes a novel testing procedure for selecting a sparse set of covariates that explains a large dimensional panel. Our selection method provides correct false detection control while having higher power than existing…
Scholars frequently use covariate balance tests to test the validity of natural experiments and related designs. Unfortunately, when measured covariates are unrelated to potential outcomes, balance is uninformative about key identification…
Many multiple testing procedures make use of the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently…
A platform trial with a master protocol provides an infrastructure to ethically and efficiently evaluate multiple treatment options in multiple diseases. Given that certain study drugs can enter or exit a platform trial, the randomization…
Current statistical inference problems in areas like astronomy, genomics, and marketing routinely involve the simultaneous testing of thousands -- even millions -- of null hypotheses. For high-dimensional multivariate distributions, these…
In this paper, I try to tame "Basu's elephants" (data with extreme selection on observables). I propose new practical large-sample and finite-sample methods for estimating and inferring heterogeneous causal effects (under unconfoundedness)…
Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p> n$). In order to make…
Borrowing external data can improve estimation efficiency but may introduce bias when populations differ in covariate distributions or outcome variability. A proper balance needs to be maintained between the two datasets to justify the…
This paper introduces novel weighted conformal p-values and methods for model-free selective inference. The problem is as follows: given test units with covariates $X$ and missing responses $Y$, how do we select units for which the…
The use of weights provides an effective strategy to incorporate prior domain knowledge in large-scale inference. This paper studies weighted multiple testing in a decision-theoretic framework. We develop oracle and data-driven procedures…
Selection bias can hinder accurate estimation of association parameters in binary disease risk models using non-probability samples like electronic health records (EHRs). The issue is compounded when participants are recruited from multiple…