Related papers: When Is the First Spurious Variable Selected by Se…

Post-Lasso Inference for High-Dimensional Regression

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…

Methodology · Statistics 2018-06-19 X. Jessie Jeng, Huimin Peng, Wenbin Lu

"Pre-conditioning" for feature selection and regression in high-dimensional problems

We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response…

Statistics Theory · Mathematics 2013-04-16 Debashis Paul, Eric Bair, Trevor Hastie, Robert Tibshirani

Selective Inference in Propensity Score Analysis

Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…

Methodology · Statistics 2021-11-25 Yoshiyuki Ninomiya, Yuta Umezu, Ichiro Takeuchi

Random lasso

We propose a computationally intensive method, the random lasso method, for variable selection in linear models. The method consists of two major steps. In step 1, the lasso method is applied to many bootstrap samples, each using a set of…

Applications · Statistics 2011-04-19 Sijian Wang, Bin Nan, Saharon Rosset, Ji Zhu

Variable Selection Incorporating Prior Constraint Information into Lasso

We propose a variable selection procedure that incorporates prior constraint information into the lasso. The proposed procedure combines the sample and prior information, and selects significant variables for responses in a narrower region where…

Methodology · Statistics 2011-02-19 Shurong Zheng, Guodong Song, Ning-Zhong Shi

Accurate Inference for Penalized Logistic Regression

Inference for high-dimensional logistic regression models using penalized methods has been a challenging research problem. As an illustration, a major difficulty is the significant bias of the Lasso estimator, which limits its direct…

Methodology · Statistics 2024-10-29 Yuming Zhang, Stéphane Guerrier, Runze Li

On stepwise regression

Given data $y$ and $k$ covariates $x$, one problem in linear regression is to decide which, if any, of the covariates to include when regressing $y$ on the $x$. If $k$ is small it is possible to evaluate each subset of the $x$. If however $k$…

Statistics Theory · Mathematics 2016-05-17 Patrick Laurie Davies

The Loss Rank Criterion for Variable Selection in Linear Regression Analysis

Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model…

Methodology · Statistics 2014-02-26 Minh-Ngoc Tran

Some Two-Step Procedures for Variable Selection in High-Dimensional Linear Regression

We study the problem of high-dimensional variable selection via some two-step procedures. First we show that given some good initial estimator which is $\ell_{\infty}$-consistent but not necessarily variable selection consistent, we can…

Statistics Theory · Mathematics 2008-10-10 Jian Zhang, Xinge Jessie Jeng, Han Liu

Optimal Two-Step Prediction in Regression

High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning…

Methodology · Statistics 2017-06-07 Didier Chételat, Johannes Lederer, Joseph Salmon

Robust Bayesian causal estimation for causal inference in medical diagnosis

Causal effect estimation is a critical task in statistical learning that aims to find the causal effect on subjects by identifying causal links between a number of predictor (or, explanatory) variables and the outcome of a treatment. In a…

Methodology · Statistics 2024-11-26 Tathagata Basu, Matthias C. M. Troffaes

Structural randomised selection

An important problem in the analysis of high-dimensional omics data is to identify subsets of molecular variables that are associated with a phenotype of interest. This requires addressing the challenges of high dimensionality, strong…

Methodology · Statistics 2022-04-05 Fan Wang, Sylvia Richardson, Steven M. Hill

Variable Selection Using a Smooth Information Criterion for Distributional Regression Models

Modern variable selection procedures make use of penalization methods to execute simultaneous model selection and estimation. A popular method is the LASSO (least absolute shrinkage and selection operator), the use of which requires…

Methodology · Statistics 2023-01-12 Meadhbh O'Neill, Kevin Burke

Exact Post-Selection Inference for Sequential Regression Procedures

We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection…

Methodology · Statistics 2015-10-13 Ryan J. Tibshirani, Jonathan Taylor, Richard Lockhart, Robert Tibshirani

In Defense of the Indefensible: A Very Naive Approach to High-Dimensional Inference

A great deal of interest has recently focused on conducting inference on the parameters in a high-dimensional linear model. In this paper, we consider a simple and very naïve two-step procedure for this task, in which we (i) fit a lasso…

Methodology · Statistics 2020-07-02 Sen Zhao, Daniela Witten, Ali Shojaie

A Discussion on Practical Considerations with Sparse Regression Methodologies

Sparse linear regression is a vast field and there are many different algorithms available to build models. Two new papers published in Statistical Science study the comparative performance of several sparse regression methodologies,…

Machine Learning · Computer Science 2021-02-10 Owais Sarwar, Benjamin Sauk, Nikolaos V. Sahinidis

Adjustment with Three Continuous Variables

Spurious association between X and Y may be due to a confounding variable W. Statisticians may adjust for W using a variety of techniques. This paper presents the results of simulations conducted to assess the performance of those…

Methodology · Statistics 2023-10-11 Brian Knaeble

Inference in Regression Discontinuity Designs with High-Dimensional Covariates

We study regression discontinuity designs in which many predetermined covariates, possibly many more than the number of observations, can be used to increase the precision of treatment effect estimates. We consider a two-step estimator…

Econometrics · Economics 2022-05-06 Alexander Kreiß, Christoph Rothe

The Lasso Problem and Uniqueness

The lasso is a popular tool for sparse linear regression, especially for problems in which the number of variables p exceeds the number of observations n. But when p>n, the lasso criterion is not strictly convex, and hence it may not have a…

Statistics Theory · Mathematics 2012-11-06 Ryan J. Tibshirani

Revisiting Marginal Regression

The lasso has become an important practical tool for high dimensional regression as well as the object of intense theoretical investigation. But despite the availability of efficient algorithms, the lasso remains computationally demanding…

Statistics Theory · Mathematics 2009-11-23 Christopher Genovese, Jiashun Jin, Larry Wasserman