Related papers: High-dimensional Variable Screening via Conditiona…
Independence screening is a powerful method for variable selection for "Big Data" when the number of variables is massive. Commonly used independence screening methods are based on marginal correlations or variations thereof. In many…
In recent years we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a…
In this paper, we propose a directional-regression-based approach for ultrahigh dimensional sufficient variable screening with censored responses. The new method is designed in a model-free manner and thus can be adapted to various complex…
We consider the problem of variable screening in ultra-high dimensional generalized linear models (GLMs) of non-polynomial orders. Since the popular SIS approach is extremely unstable in the presence of contamination and noise, we discuss a…
We introduce a quantile-adaptive framework for nonlinear variable screening with high-dimensional heterogeneous data. This framework has two distinctive features: (1) it allows the set of active variables to vary across quantiles, thus…
In the ultrahigh dimensional setting, independence screening has been shown, both theoretically and empirically, to be a useful variable selection framework with low computational cost. In this work, we propose a two-step framework by using marginal…
Advances in technology have generated abundant high-dimensional data that allow integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…
We propose a nonparametric approach to testing conditional independence and estimating conditional association, generalizing the Cochran-Mantel-Haenszel (CMH) test and odds-ratio estimator to continuous sample spaces. It leverages a…
Independence screening methods such as the two sample $t$-test and the marginal correlation based ranking are among the most widely used techniques for variable selection in ultrahigh dimensional data sets. In this short note, simple…
This article deals with the problem of testing conditional independence between two random vectors ${\bf X}$ and ${\bf Y}$ given a confounding random vector ${\bf Z}$. Several authors have considered this problem for multivariate data.…
Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p, small n" paradigms when p can be as…
High dimensional time series datasets are becoming increasingly common in various fields such as economics, finance, meteorology, and neuroscience. Given this ubiquity of time series data, it is surprising that very few works on variable…
In high dimensional analysis, effects of explanatory variables on responses sometimes rely on certain exposure variables, such as time or environmental factors. In this paper, to characterize the importance of each predictor, we utilize its…
Motivated by applications in biological science, we propose a novel test to assess the conditional mean dependence of a response variable on a large number of covariates. Our procedure is built on the martingale difference divergence…
This paper advances a variable screening approach to enhance conditional quantile forecasts using high-dimensional predictors. We have refined and augmented the quantile partial correlation (QPC)-based variable screening proposed by Ma et…
The varying-coefficient model is an important nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection…
Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to…
Selecting the active variables that have a significant impact on the event of interest is an important problem in the statistical analysis of ultrahigh-dimensional data. The sure independence screening procedure has been…
Feature screening approaches are effective in selecting active features from data with ultrahigh dimensionality and increasing complexity; however, the majority of existing feature screening approaches are either restricted to a univariate…
The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…