Related papers: Significance Testing and Group Variable Selection
Let X be a d-dimensional vector of covariates and Y be the response variable. Under the nonparametric model $Y = m(X) + \sigma(X)\epsilon$ we develop an ANOVA-type test for the null hypothesis that a particular coordinate of X has no influence…
Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…
In this paper, we develop invariance-based procedures for testing and inference in high-dimensional regression models. These procedures, also known as randomization tests, provide several important advantages. First, for the global null…
The logical and practical difficulties associated with research interpretation using P values and null hypothesis significance testing have been extensively documented. This paper describes an alternative, likelihood-based approach to…
We develop methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the…
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test…
In this paper, we propose a test procedure based on the LASSO methodology to test the global null hypothesis of no dependence between a response variable and $p$ predictors, where $n$ observations with $n < p$ are available. The proposed…
In many engineering and scientific applications, prediction variables are grouped, for example, in biological applications where assayed genes or proteins can be grouped by biological roles or biological pathways. Common statistical…
In this paper, we are concerned with how to select significant variables in semiparametric modeling. Variable selection for semiparametric regression models consists of two components: model selection for nonparametric components and…
Consider a random vector $(X,Y)$ and let $m(x)=E(Y|X=x)$. We are interested in testing $H_0:m\in {\cal M}_{\Theta,{\cal G}}=\{\gamma(\cdot,\theta,g):\theta \in \Theta,g\in {\cal G}\}$ for some known function $\gamma$, some compact set…
Testing for the significance of a subset of regression coefficients in a linear model, a staple of statistical analysis, goes back at least to the work of Fisher who introduced the analysis of variance (ANOVA). We study this problem under…
We propose a test of the significance of a variable appearing on the Lasso path and use it in a procedure for selecting one of the models of the Lasso path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a set A of…
Motivated by the importance of measuring the association between the response and predictors in high-dimensional data, in this article we propose a new mean variance test of independence between a categorical random variable and a…
We generalize Levene's test for variance (scale) heterogeneity between $k$ groups for more complex data, which includes sample correlation and group membership uncertainty. Following a two-stage regression framework, we show that least…
This article investigates uncertainty quantification of the generalized linear lasso (GLL), a popular variable selection method in high-dimensional regression settings. In many fields of study, researchers use data-driven methods to select…
We propose a novel method for testing the null hypothesis of no effect of a covariate on the response in the context of functional linear concurrent regression. We establish an equivalent random effects formulation of our functional…
We propose a new testing procedure of heteroskedasticity in high-dimensional linear regression, where the number of covariates can be larger than the sample size. Our testing procedure is based on residuals of the Lasso. We demonstrate that…
We propose a general, modular method for significance testing of groups (or clusters) of variables in a high-dimensional linear model. In the presence of high correlations among the covariables, due to serious problems of identifiability, it is…
We consider the problem of testing significance of predictors in multivariate nonparametric quantile regression. A stochastic process is proposed, which is based on a comparison of the responses with a nonparametric quantile regression…
We propose a new measure of variable importance in high-dimensional regression based on the change in the LASSO solution path when one covariate is left out. The proposed procedure provides a novel way to calculate variable importance and…