Related papers: Large Scale Partial Correlation Screening with Unc…
We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small…
This paper treats the problem of screening for variables with high correlations in high dimensional data in which there can be many fewer samples than variables. We focus on threshold-based correlation screening methods for three related…
Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample…
This paper studies distributed conditional feature screening for massive data with ultrahigh-dimensional features. Specifically, three distributed partial correlation feature screening methods (SAPS, ACPS and JDPS methods) are firstly…
High dimensional time series datasets are becoming increasingly common in various fields such as economics, finance, meteorology, and neuroscience. Given this ubiquity of time series data, it is surprising that very few works on variable…
This paper proposes a novel testing procedure for selecting a sparse set of covariates that explains a large dimensional panel. Our selection method provides correct false detection control while having higher power than existing…
As a computationally fast and efficient tool, sure independence screening has received much attention in solving ultrahigh dimensional problems. This paper contributes two robust sure screening approaches that simultaneously take…
The identification of the dependent components in multiple data sets is a fundamental problem in many practical applications. The challenge in these applications is that often the data sets are high-dimensional with few observations or…
This paper advances a variable screening approach to enhance conditional quantile forecasts using high-dimensional predictors. We have refined and augmented the quantile partial correlation (QPC)-based variable screening proposed by Ma et…
Multiple hypothesis testing, a situation in which we wish to consider many hypotheses, is a core problem in statistical inference that arises in almost every scientific field. In this setting, controlling the false discovery rate (FDR), which…
In variable selection, most existing screening methods focus on marginal effects and ignore dependence between covariates. To improve the performance of selection, we incorporate pairwise effects in covariates for screening and…
Controlling the false discovery rate (FDR) is a powerful approach to multiple testing. In many applications, the tested hypotheses have an inherent hierarchical structure. In this paper, we focus on the fixed sequence structure where the…
Screening for ultrahigh dimensional features may encounter complicated issues such as outlying observations, heteroscedasticity or heavy-tailed distribution, multi-collinearity and confounding effects. Standard correlation-based marginal…
Multiple hypothesis testing is a fundamental problem in high dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find if any…
Replicability is a lynchpin for credible discoveries. The partial conjunction (PC) p-value, which combines individual base p-values from multiple similar studies, can gauge whether a feature of interest exhibits replicated signals across…
Feature screening is a useful and popular way to detect informative predictors for ultrahigh-dimensional data before proceeding with statistical analysis or constructing statistical models. While a large body of feature screening procedures…
Many approaches for multiple testing begin with the assumption that all tests in a given study should be combined into a global false-discovery-rate analysis. But this may be inappropriate for many of today's large-scale screening problems,…
Confounding matters in almost all observational studies that focus on causality. In order to eliminate bias caused by confounders, oftentimes a substantial number of features need to be collected in the analysis. In this case, large p…
Functional principal component analysis (FPCA) is a fundamental tool and has attracted increasing attention in recent decades, while existing methods are restricted to data with a single or finite number of random functions (much smaller…
Sparse principal component analysis (PCA) aims at mapping large dimensional data to a linear subspace of lower dimension. By imposing loading vectors to be sparse, it performs the double duty of dimension reduction and variable selection.…