Related papers: Doubly robust and computationally efficient high-d…

Variable Selection for Doubly Robust Causal Inference

Confounding control is crucial and yet challenging for causal inference based on observational studies. Under the typical unconfoundedness assumption, augmented inverse probability weighting (AIPW) has been popular for estimating the average…

Methodology · Statistics 2023-01-27 Eunah Cho, Shu Yang

Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm

We consider variable selection in high-dimensional linear models where the number of covariates greatly exceeds the sample size. We introduce the new concept of partial faithfulness and use it to infer associations between the covariates…

Methodology · Statistics 2012-01-12 Peter Bühlmann, Markus Kalisch, Marloes H. Maathuis

Prediction of High-Performance Computing Input/Output Variability and Its Application to Optimization for System Configurations

Performance variability is an important measure for a reliable high-performance computing (HPC) system. Performance variability is affected by complicated interactions between numerous factors, such as CPU frequency, the number of…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-16 Li Xu, Thomas Lux, Tyler Chang, Bo Li, Yili Hong, Layne Watson, Ali Butt, Danfeng Yao, Kirk Cameron

A More Powerful Two-Sample Test in High Dimensions using Random Projection

We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Specifically, we…

Statistics Theory · Mathematics 2015-09-15 Miles E. Lopes, Laurent J. Jacob, Martin J. Wainwright

Robust estimation of principal components from depth-based multivariate rank covariance matrix

Analyzing principal components for multivariate data from its spatial sign covariance matrix (SCM) has been proposed as a computationally simple and robust alternative to normal PCA, but it suffers from poor efficiency properties and is…

Statistics Theory · Mathematics 2016-03-10 Subhabrata Majumdar

PCM Selector: Penalized Covariate-Mediator Selection Operator for Evaluating Linear Causal Effects

For a data-generating process for random variables that can be described with a linear structural equation model, we consider a situation in which (i) a set of covariates satisfying the back-door criterion cannot be observed or (ii) such a…

Methodology · Statistics 2025-03-06 Hisayoshi Nanmo, Manabu Kuroki

A robust covariance testing approach for high-throughput data

The problem of testing changes in covariance has received increasing attention in recent years, especially in the context of high-dimensional testing. A number of approaches have been proposed, all limited to the two-sample problem and…

Methodology · Statistics 2016-09-06 Yi-Hui Zhou

Pearson Chi-squared Conditional Randomization Test

Conditional independence (CI) testing arises naturally in many scientific problems and application domains. The goal of this problem is to investigate the conditional independence between a response variable $Y$ and another variable $X$,…

Methodology · Statistics 2025-10-07 Adel Javanmard, Mohammad Mehrabi

Conformal Prediction with Corrupted Labels: Uncertain Imputation and Robust Re-weighting

We introduce a framework for robust uncertainty quantification in situations where labeled training data are corrupted, through noisy or missing labels. We build on conformal prediction, a statistical tool for generating prediction sets…

Machine Learning · Computer Science 2026-02-27 Shai Feldman, Stephen Bates, Yaniv Romano

A covariate-adaptive test for replicability across multiple studies with false discovery rate control

Replicability is a lynchpin for credible discoveries. The partial conjunction (PC) p-value, which combines individual base p-values from multiple similar studies, can gauge whether a feature of interest exhibits replicated signals across…

Methodology · Statistics 2025-07-29 Ninh Tran, Dennis Leung

Covariate powered cross-weighted multiple testing

A fundamental task in the analysis of datasets with many variables is screening for associations. This can be cast as a multiple testing task, where the objective is achieving high detection power while controlling type I error. We consider…

Methodology · Statistics 2021-09-01 Nikolaos Ignatiadis, Wolfgang Huber

A Two-Sample Conditional Distribution Test Using Conformal Prediction and Weighted Rank Sum

We consider the problem of testing the equality of conditional distributions of a response variable given a vector of covariates between two populations. Such a hypothesis testing problem can be motivated from various machine learning and…

Methodology · Statistics 2023-02-24 Xiaoyu Hu, Jing Lei

Doubly robust matching estimators for high dimensional confounding adjustment

Valid estimation of treatment effects from observational data requires proper control of confounding. If the number of covariates is large relative to the number of observations, then controlling for all available covariates is infeasible.…

Methodology · Statistics 2018-01-11 Joseph Antonelli, Matthew Cefalu, Nathan Palmer, Denis Agniel

Doubly Robust Inference for Hazard Ratio under Informative Censoring with Machine Learning

Randomized clinical trials with time-to-event outcomes have traditionally used the log-rank test followed by the Cox proportional hazards (PH) model to estimate the hazard ratio between the treatment groups. These are valid under the…

Methodology · Statistics 2022-06-07 Jiyu Luo, Ronghui Xu

Selecting Robust Features for Machine Learning Applications using Multidata Causal Discovery

Robust feature selection is vital for creating reliable and interpretable Machine Learning (ML) models. When designing statistical prediction models in cases where domain knowledge is limited and underlying interactions are unknown,…

Machine Learning · Statistics 2023-07-03 Saranya Ganesh S., Tom Beucler, Frederick Iat-Hin Tam, Milton S. Gomez, Jakob Runge, Andreas Gerhardus

Effective Positive Cauchy Combination Test

In the field of multiple hypothesis testing, combining p-values represents a fundamental statistical method. The Cauchy combination test (CCT) (Liu and Xie, 2020) excels among numerous methods for combining p-values with powerful and…

Methodology · Statistics 2024-10-17 Yanyan Ouyang, Xingwei Liu, Lixing Zhu, Wangli Xu

On the power of conditional independence testing under model-X

For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the recently introduced model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs…

Statistics Theory · Mathematics 2022-11-01 Eugene Katsevich, Aaditya Ramdas

A Power Analysis of the Conditional Randomization Test and Knockoffs

In many scientific problems, researchers try to relate a response variable $Y$ to a set of potential explanatory variables $X = (X_1,\dots,X_p)$, and start by trying to identify variables that contribute to this relationship. In statistical…

Statistics Theory · Mathematics 2020-10-07 Wenshuo Wang, Lucas Janson

The Hardness of Conditional Independence Testing and the Generalised Covariance Measure

It is a common saying that testing for conditional independence, i.e., testing whether two random vectors $X$ and $Y$ are independent, given $Z$, is a hard statistical problem if $Z$ is a continuous random variable (or vector). In…

Statistics Theory · Mathematics 2022-03-25 Rajen D. Shah, Jonas Peters

Robust high dimensional factor models with applications to statistical machine learning

Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance. As data are…

Methodology · Statistics 2018-08-14 Jianqing Fan, Kaizheng Wang, Yiqiao Zhong, Ziwei Zhu