Related papers: Extending Model-X Framework to Missing Data

Robust inference with knockoffs

We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_1, \ldots, X_p$. We wish to do so while offering finite-sample guarantees about the…

Methodology · Statistics 2019-02-12 Rina Foygel Barber, Emmanuel J. Candès, Richard J. Samworth

Relaxing the Assumptions of Knockoffs by Conditioning

The recent paper Candès et al. (2018) introduced model-X knockoffs, a method for variable selection that provably and non-asymptotically controls the false discovery rate with no restrictions or assumptions on the dimensionality of the…

Methodology · Statistics 2020-06-16 Dongming Huang, Lucas Janson

Metropolized Knockoff Sampling

Model-X knockoffs is a wrapper that transforms essentially any feature importance measure into a variable selection algorithm, which discovers true effects while rigorously controlling the expected fraction of false positives. A frequently…

Methodology · Statistics 2024-03-12 Stephen Bates, Emmanuel Candès, Lucas Janson, Wenshuo Wang

Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection

Many contemporary large-scale applications involve building interpretable models linking a large set of potential covariates to a response in a nonlinear fashion, such as when the response is binary. Although this modeling problem has been…

Methodology · Statistics 2017-12-13 Emmanuel Candes, Yingying Fan, Lucas Janson, Jinchi Lv

Knockoffs Inference under Privacy Constraints

The model-X knockoff framework offers a model-free variable selection method that ensures finite-sample false discovery rate (FDR) control. However, the complexity of generating knockoff variables, coupled with the model-free assumption,…

Methodology · Statistics 2025-06-12 Zhanrui Cai, Yingying Fan, Lan Gao

Variable selection via knockoffs in missing data settings with categorical predictors

Large-scale assessment data typically include numerous categorical variables, often affected by missing values. Motivated by the challenges arising in this framework, we extend the knockoffs method for selecting predictors to settings with…

Methodology · Statistics 2026-05-13 Silvia Bacci, Emanuela Dreassi, Leonardo Grilli, Carla Rampichini

Multiple Model-Free Knockoffs

Model-free knockoffs is a recently proposed technique for identifying covariates that are likely to have an effect on a response variable. It is an efficient way to control the false discovery rate in hypothesis tests for separate…

Methodology · Statistics 2019-03-29 Lars Holden, Kristoffer Hellton

Error-based Knockoffs Inference for Controlled Feature Selection

Recently, the scheme of model-X knockoffs was proposed as a promising solution to address controlled feature selection under high-dimensional finite-sample settings. However, the procedure of model-X knockoffs depends heavily on the…

Methodology · Statistics 2022-03-10 Xuebin Zhao, Hong Chen, Yingjie Wang, Weifu Li, Tieliang Gong, Yulong Wang, Feng Zheng

Robust Knockoffs for Controlling False Discoveries With an Application to Bond Recovery Rates

We address challenges in variable selection with highly correlated data, which are frequently present in finance and economics, but also in complex natural systems such as weather. We develop a robustified version of the knockoff framework,…

Econometrics · Economics 2022-06-14 Konstantin Görgen, Abdolreza Nazemi, Melanie Schienle

Deep Knockoffs

This paper introduces a machine for sampling approximate model-X knockoffs for arbitrary and unspecified data distributions using deep generative models. The main idea is to iteratively refine a knockoff sampling mechanism until a criterion…

Methodology · Statistics 2020-03-03 Yaniv Romano, Matteo Sesia, Emmanuel J. Candès

When Knockoffs fail: diagnosing and fixing non-exchangeability of Knockoffs

Knockoffs are a popular statistical framework that addresses the challenging problem of conditional variable selection in high-dimensional settings with statistical control. Such statistical control is essential for the reliability of…

Methodology · Statistics 2025-04-30 Alexandre Blain, Angel Reyero Lobo, Julia Linhart, Bertrand Thirion, Pierre Neuvial

An ensemble learning method for variable selection: application to high dimensional data and missing values

Standard approaches for variable selection in linear models are not tailored to deal properly with high-dimensional and incomplete data. Currently, methods dedicated to high-dimensional data handle missing values by ad-hoc strategies, like…

Methodology · Statistics 2021-06-09 Avner Bar-Hen, Vincent Audigier

Powerful Knockoffs via Minimizing Reconstructability

Model-X knockoffs allows analysts to perform feature selection using almost any machine learning algorithm while still provably controlling the expected proportion of false discoveries. To apply model-X knockoffs, one must construct…

Methodology · Statistics 2021-06-30 Asher Spector, Lucas Janson

Sharing pattern submodels for prediction with missing values

Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels has been proposed as…

Machine Learning · Computer Science 2023-11-27 Lena Stempfle, Ashkan Panahi, Fredrik D. Johansson

CoxKnockoff: Controlled Feature Selection for the Cox Model Using Knockoffs

Although there is a huge literature on feature selection for the Cox model, none of the existing approaches can control the false discovery rate (FDR) unless the sample size tends to infinity. In addition, there is no formal power analysis…

Methodology · Statistics 2023-08-02 Daoji Li, Jinzhao Yu, Hui Zhao

Searching for local associations while controlling the false discovery rate

We introduce local conditional hypotheses that express how the relation between explanatory variables and outcomes changes across different contexts, described by covariates. By expanding upon the model-X knockoff filter, we show how to…

Methodology · Statistics 2026-01-12 Paula Gablenz, Matteo Sesia, Tianshu Sun, Chiara Sabatti

Derandomizing Knockoffs

Model-X knockoffs is a general procedure that can leverage any feature importance measure to produce a variable selection algorithm, which discovers true effects while rigorously controlling the number or fraction of false positives.…

Methodology · Statistics 2020-12-07 Zhimei Ren, Yuting Wei, Emmanuel Candès

On the Construction of Knockoffs in Case-Control Studies

Consider a case-control study in which we have a random sample, constructed in such a way that the proportion of cases in our sample is different from that in the general population; for instance, the sample is constructed to achieve a…

Methodology · Statistics 2019-01-01 Rina Foygel Barber, Emmanuel Candes

Functional knockoffs selection with applications to functional data analysis in high dimensions

Knockoffs is a recently proposed, powerful framework that effectively controls the false discovery rate (FDR) for variable selection. However, none of the existing knockoff solutions are directly suited to handle multivariate or…

Methodology · Statistics 2024-06-28 Xinghao Qiao, Mingya Long, Qizhai Li

An integrated approach to test for missing not at random

Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a…

Methodology · Statistics 2022-12-08 Jack Noonan, Adetola Adedamola Adediran, Robin Mitra, Stefanie Biedermann