Related papers: Statistical inference after variable selection in …
Variable selection for regression models plays a key role in the analysis of biomedical data. However, inference after selection is not covered by classical statistical frequentist theory which assumes a fixed set of covariates in the…
We develop a post-selection inference method for the Cox proportional hazards model with interval-censored data, which provides asymptotically valid p-values and confidence intervals conditional on the model selected by lasso. The method is…
For statistical inference on regression models with a diverging number of covariates, the existing literature typically makes sparsity assumptions on the inverse of the Fisher information matrix. Such assumptions, however, are often…
We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the…
The problem of how to best select variables for confounding adjustment forms one of the key challenges in the evaluation of exposure effects in observational studies, and has been the subject of vigorous recent activity in causal inference.…
We develop tools to do valid post-selective inference for a family of model selection procedures, including choosing a model via cross-validated Lasso. The tools apply universally when the following random vectors are jointly asymptotically…
Fulfilling the promise of precision medicine requires accurately and precisely classifying disease states. For cancer, this includes prediction of survival time from a surfeit of covariates. Such data present an opportunity for improved…
The analysis of randomized trials with time-to-event endpoints is nearly always plagued by the problem of censoring. As the censoring mechanism is usually unknown, analyses typically employ the assumption of non-informative censoring. While…
The variable selection problem for the nonlinear Cox regression model is considered. In survival analysis, one main objective is to identify the covariates that are associated with the risk of experiencing the event of interest. The Cox…
We develop methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the…
The instability in the selection of models is a major concern with data sets containing a large number of covariates. This paper deals with variable selection methodology in the case of high-dimensional problems where the response variable…
Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…
Selective inference methods are developed for group lasso estimators for use with a wide class of distributions and loss functions. The method includes the use of exponential family distributions, as well as quasi-likelihood modeling for…
We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances.…
The Tweedie exponential dispersion family is a popular choice among many to model insurance losses that consist of zero-inflated semicontinuous data. In such data, it is often important to obtain credibility (inference) of the most…
This thesis studies two problems in modern statistics. First, we study selective inference, or inference for hypotheses that are chosen after looking at the data. The motivating application is inference for regression coefficients selected by…
Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…
We consider high-dimensional inference for potentially misspecified Cox proportional hazards models based on low-dimensional results by Lin and Wei [1989]. A de-sparsified Lasso estimator is proposed based on the log partial likelihood…
When the model is not known and parameter testing or interval estimation is conducted after model selection, it is necessary to consider selective inference. This paper discusses this issue in the context of sparse estimation. Firstly, we…
We develop a set of variable selection methods for the Cox model under interval censoring, in the ultra-high dimensional setting where the dimensionality can grow exponentially with the sample size. The methods select covariates via a…