English
Related papers

Related papers: Large Scale Correlation Screening

200 papers

We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small…

Machine Learning · Statistics 2013-04-11 Hamed Firouzi , Bala Rajaratnam , Alfred Hero

Independence screening methods such as the two sample $t$-test and the marginal correlation based ranking are among the most widely used techniques for variable selection in ultrahigh dimensional data sets. In this short note, simple…

Methodology · Statistics 2020-11-17 Run Wang , Somak Dutta , Vivekananda Roy

Advancement in technology has generated abundant high-dimensional data that allows integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…

Methodology · Statistics 2017-10-12 Tianzhou Ma , Zhao Ren , George C. Tseng

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…

Methodology · Statistics 2016-11-29 Haeran Cho , Piotr Fryzlewicz

Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to…

Statistics Theory · Mathematics 2020-07-07 Talal Ahmed , Waheed U. Bajwa

In recent years we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a…

Methodology · Statistics 2025-02-07 Andrea Bratsberg , Abhik Ghosh , Magne Thoresen

Identifying multivariate dependencies in high-dimensional data is an important problem in large-scale inference. This problem has motivated recent advances in mining (partial) correlations, which focus on the challenging ultra-high…

Methodology · Statistics 2025-09-23 Emily Neo , Peter Radchenko , Bala Rajaratnam

In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…

Methodology · Statistics 2019-02-13 Yu-Hsiang Cheng , Tzee-Ming Huang , Su-Yun Huang

Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample…

Statistics Theory · Mathematics 2022-08-25 Yun Wei , Bala Rajaratnam , Alfred O. Hero

In variable selection, most existing screening methods focus on marginal effects and ignore dependence between covariates. To improve the performance of selection, we incorporate pairwise effects in covariates for screening and…

Methodology · Statistics 2019-02-12 Siliang Gong , Kai Zhang , Yufeng Liu

Feature screening is an important method to reduce the dimension and capture informative variables in ultrahigh-dimensional data analysis. Many methods have been developed for feature screening. These methods, however, are challenged by…

Methodology · Statistics 2019-01-08 Li-Pang Chen

Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p, small n" paradigms when p can be as…

Methodology · Statistics 2012-10-18 Gaorong Li , Heng Peng , Jun Zhang , Lixing Zhu

High-dimensional data are commonly seen in modern statistical applications, variable selection methods play indispensable roles in identifying the critical features for scientific discoveries. Traditional best subset selection methods are…

Methodology · Statistics 2022-12-29 Tianzhou Ma , Hongjie Ke , Zhao Ren

High dimensional time series datasets are becoming increasingly common in various fields such as economics, finance, meteorology, and neuroscience. Given this ubiquity of time series data, it is surprising that very few works on variable…

Methodology · Statistics 2018-04-17 Kashif Yousuf , Yang Feng

As a computationally fast and working efficient tool, sure independence screening has received much attention in solving ultrahigh dimensional problems. This paper contributes two robust sure screening approaches that simultaneously take…

Methodology · Statistics 2021-07-27 Xiaochao Xia

Feature screening is an important tool in analyzing ultrahigh-dimensional data, particularly in the field of Omics and oncology studies. However, most attention has been focused on identifying features that have a linear or monotonic impact…

Methodology · Statistics 2023-05-10 Yaxian Chen , KF Lam , Zhonghua Liu

Differential co-expression analysis has been widely applied by scientists in understanding the biological mechanisms of diseases. However, the unknown differential patterns are often complicated; thus, models based on simplified parametric…

Methodology · Statistics 2022-01-13 Tianxi Li , Xiwei Tang , Ajay Chatrath

Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical…

Methodology · Statistics 2015-10-22 T. Tony Cai , Anru Zhang

Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to…

Statistics Theory · Mathematics 2015-06-09 Xiangyu Wang , Chenlei Leng , David B. Dunson

In this paper we propose a linear variable screening method for computer experiments when the number of input variables is larger than the number of runs. This method uses a linear model to model the nonlinear data, and screens the…

Methodology · Statistics 2020-06-16 Chunya Li , Daijun Chen , Shifeng Xiong
‹ Prev 1 2 3 10 Next ›