Related papers: Large-P Variable Selection in Two-Stage Models

A Two-Stage Variable Selection Approach for Correlated High Dimensional Predictors

When fitting statistical models, some predictors are often found to be correlated with each other, and functioning together. Many group variable selection methods are developed to select the groups of predictors that are closely related to…

Methodology · Statistics 2021-03-25 Zhiyuan Li

Which bridge estimator is optimal for variable selection?

We study the problem of variable selection for linear models under the high-dimensional asymptotic setting, where the number of observations $n$ grows at the same rate as the number of predictors $p$. We consider two-stage variable…

Statistics Theory · Mathematics 2020-03-27 Shuaiwen Wang, Haolei Weng, Arian Maleki

Robustness and efficiency of covariate adjusted linear instrumental variable estimators

Two-stage least squares (TSLS) estimators and variants thereof are widely used to infer the effect of an exposure on an outcome using instrumental variables (IVs). They belong to a wider class of two-stage IV estimators, which are based on…

Methodology · Statistics 2015-10-08 Stijn Vansteelandt, Vanessa Didelez

High-dimensional variable selection

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression…

Statistics Theory · Mathematics 2009-08-20 Larry Wasserman, Kathryn Roeder

Beyond Support in Two-Stage Variable Selection

Numerous variable selection methods rely on a two-stage procedure, where a sparsity-inducing penalty is used in the first stage to predict the support, which is then conveyed to the second stage for estimation or inference purposes. In this…

Applications · Statistics 2015-05-28 Jean-Michel Bécu, Yves Grandvalet, Christophe Ambroise, Cyril Dalmasso

A High-dimensional M-estimator Framework for Bi-level Variable Selection

In high-dimensional data analysis, bi-level sparsity is often assumed when covariates function group-wisely and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should be able to…

Methodology · Statistics 2021-09-14 Bin Luo, Xiaoli Gao

Two-Stage Testing in a high dimensional setting

In a high dimensional regression setting in which the number of variables ($p$) is much larger than the sample size ($n$), the number of possible two-way interactions between the variables is immense. If the number of variables is in the…

Methodology · Statistics 2024-06-26 Marianne A Jonker, Luc van Schijndel, Eric Cator

IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery

In the presence of confounding between an endogenous variable and the outcome, instrumental variables (IVs) are used to isolate the causal effect of the endogenous variable. Identifying valid instruments requires interdisciplinary…

Artificial Intelligence · Computer Science 2026-04-07 Ivaxi Sheth, Zhijing Jin, Bryan Wilder, Dominik Janzing, Mario Fritz

A Statistical Decision-Theoretical Perspective on the Two-Stage Approach to Parameter Estimation

One of the most important problems in system identification and statistics is how to estimate the unknown parameters of a given model. Optimization methods and specialized procedures, such as Empirical Minimization (EM) can be used in case…

Methodology · Statistics 2024-02-09 Braghadeesh Lakshminarayanan, Cristian R. Rojas

Two-step estimation of latent trait models

We consider likelihood-based two-step estimation of latent variable models, in which just the measurement model is estimated in the first step and the measurement parameters are then fixed at their estimated values in the second step where…

Methodology · Statistics 2025-08-26 Jouni Kuha, Zsuzsa Bakk

ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data

High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which…

Methodology · Statistics 2021-07-09 Kaixu Yang, Tapabrata Maiti

Variable Selection for High-dimensional Generalized Linear Models using an Iterated Conditional Modes/Medians Algorithm

High-dimensional linear and nonlinear models have been extensively used to identify associations between response and explanatory variables. The variable selection problem is commonly of interest in the presence of massive and complex data.…

Methodology · Statistics 2017-08-10 Vitara Pungpapong, Min Zhang, Dabao Zhang

Mining Causality: AI-Assisted Search for Instrumental Variables

The instrumental variables (IVs) method is a leading empirical strategy for causal inference. Finding IVs is a heuristic and creative process, and justifying its validity -- especially exclusion restrictions -- is largely rhetorical. We…

Econometrics · Economics 2025-06-06 Sukjin Han

Optimal Two-Step Prediction in Regression

High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning…

Methodology · Statistics 2017-06-07 Didier Chételat, Johannes Lederer, Joseph Salmon

Variable Selection for Comparing High-dimensional Time-Series Data

Given a pair of multivariate time-series data of the same length and dimensions, an approach is proposed to select variables and time intervals where the two series are significantly different. In applications where one time series is an…

Methodology · Statistics 2024-12-11 Kensuke Mitsuzawa, Margherita Grossi, Stefano Bortoli, Motonobu Kanagawa

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

Bayesian variable selection in high dimensional problems without assumptions on prior model probabilities

We consider the problem of variable selection in linear models when $p$, the number of potential regressors, may exceed (and perhaps substantially) the sample size $n$ (which is possibly small).

Methodology · Statistics 2016-07-12 James O. Berger, Gonzalo Garcia-Donato, Miguel A. Martinez-Beneito, Victor Peña

A Scalable Empirical Bayes Approach to Variable Selection

We develop a model-based empirical Bayes approach to variable selection problems in which the number of predictors is very large, possibly much larger than the number of responses (the so-called 'large p, small n' problem). We consider the…

Methodology · Statistics 2015-10-14 Haim Y. Bar, James G. Booth, Martin T. Wells

Bias Reduction in Instrumental Variable Estimation through First-Stage Shrinkage

The two-stage least-squares (2SLS) estimator is known to be biased when its first-stage fit is poor. I show that better first-stage prediction can alleviate this bias. In a two-stage linear regression model with Normal noise, I consider…

Statistics Theory · Mathematics 2017-11-01 Jann Spiess

A Note on High Dimensional Linear Regression with Interactions

The problem of interaction selection has recently caught much attention in high dimensional data analysis. This note aims to address and clarify several fundamental issues in interaction selection for linear regression models, especially…

Methodology · Statistics 2015-10-08 Ning Hao, Hao Helen Zhang