Related papers: When Is the First Spurious Variable Selected by Se…
Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…
We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response…
Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…
We propose a computationally intensive method, the random lasso method, for variable selection in linear models. The method consists of two major steps. In step 1, the lasso method is applied to many bootstrap samples, each using a set of…
We propose a variable selection procedure that incorporates prior constraint information into the lasso. The proposed procedure combines the sample and prior information, and selects significant variables for responses in a narrower region where…
Inference for high-dimensional logistic regression models using penalized methods has been a challenging research problem. As an illustration, a major difficulty is the significant bias of the Lasso estimator, which limits its direct…
Given data $y$ and $k$ covariates $x$, one problem in linear regression is to decide which, if any, of the covariates to include when regressing $y$ on the $x$. If $k$ is small, it is possible to evaluate each subset of the $x$. If however $k$…
Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model…
We study the problem of high-dimensional variable selection via some two-step procedures. First we show that given some good initial estimator which is $\ell_{\infty}$-consistent but not necessarily variable selection consistent, we can…
High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning…
Causal effect estimation is a critical task in statistical learning that aims to find the causal effect on subjects by identifying causal links between a number of predictor (or explanatory) variables and the outcome of a treatment. In a…
An important problem in the analysis of high-dimensional omics data is to identify subsets of molecular variables that are associated with a phenotype of interest. This requires addressing the challenges of high dimensionality, strong…
Modern variable selection procedures make use of penalization methods to execute simultaneous model selection and estimation. A popular method is the LASSO (least absolute shrinkage and selection operator), the use of which requires…
We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection…
A great deal of interest has recently focused on conducting inference on the parameters in a high-dimensional linear model. In this paper, we consider a simple and very naïve two-step procedure for this task, in which we (i) fit a lasso…
Sparse linear regression is a vast field and there are many different algorithms available to build models. Two new papers published in Statistical Science study the comparative performance of several sparse regression methodologies,…
Spurious association between X and Y may be due to a confounding variable W. Statisticians may adjust for W using a variety of techniques. This paper presents the results of simulations conducted to assess the performance of those…
We study regression discontinuity designs in which many predetermined covariates, possibly many more than the number of observations, can be used to increase the precision of treatment effect estimates. We consider a two-step estimator…
The lasso is a popular tool for sparse linear regression, especially for problems in which the number of variables p exceeds the number of observations n. But when p>n, the lasso criterion is not strictly convex, and hence it may not have a…
The lasso has become an important practical tool for high dimensional regression as well as the object of intense theoretical investigation. But despite the availability of efficient algorithms, the lasso remains computationally demanding…