Related papers: Large Scale Correlation Screening
We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small…
Independence screening methods such as the two sample $t$-test and the marginal correlation based ranking are among the most widely used techniques for variable selection in ultrahigh dimensional data sets. In this short note, simple…
Advancement in technology has generated abundant high-dimensional data that allows integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…
The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…
Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to…
In recent years we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a…
Identifying multivariate dependencies in high-dimensional data is an important problem in large-scale inference. This problem has motivated recent advances in mining (partial) correlations, which focus on the challenging ultra-high…
In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…
Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample…
In variable selection, most existing screening methods focus on marginal effects and ignore dependence between covariates. To improve the performance of selection, we incorporate pairwise effects in covariates for screening and…
Feature screening is an important method to reduce the dimension and capture informative variables in ultrahigh-dimensional data analysis. Many methods have been developed for feature screening. These methods, however, are challenged by…
Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p, small n" paradigms when p can be as…
High-dimensional data are commonly seen in modern statistical applications, variable selection methods play indispensable roles in identifying the critical features for scientific discoveries. Traditional best subset selection methods are…
High dimensional time series datasets are becoming increasingly common in various fields such as economics, finance, meteorology, and neuroscience. Given this ubiquity of time series data, it is surprising that very few works on variable…
As a computationally fast and working efficient tool, sure independence screening has received much attention in solving ultrahigh dimensional problems. This paper contributes two robust sure screening approaches that simultaneously take…
Feature screening is an important tool in analyzing ultrahigh-dimensional data, particularly in the field of Omics and oncology studies. However, most attention has been focused on identifying features that have a linear or monotonic impact…
Differential co-expression analysis has been widely applied by scientists in understanding the biological mechanisms of diseases. However, the unknown differential patterns are often complicated; thus, models based on simplified parametric…
Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical…
Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to…
In this paper we propose a linear variable screening method for computer experiments when the number of input variables is larger than the number of runs. This method uses a linear model to model the nonlinear data, and screens the…