Related papers: Variable screening using factor analysis for high-…

Robust variable screening for regression using factor profiling

Sure Independence Screening is a fast procedure for variable selection in ultra-high dimensional regression analysis. Unfortunately, its performance greatly deteriorates with increasing dependence among the predictors. To solve this issue,…

Methodology · Statistics 2018-11-15 Yixin Wang , Stefan Van Aelst

Conditional Sure Independence Screening

Independence screening is a powerful method for variable selection for `Big Data' when the number of variables is massive. Commonly used independence screening methods are based on marginal correlations or variations of it. In many…

Statistics Theory · Mathematics 2012-11-02 Emre Barut , Jianqing Fan , Anneleen Verhasselt

Screening Methods for Classification Based on Non-parametric Bayesian Tests

Feature or variable selection is a problem inherent to large data sets. While many methods have been proposed to deal with this problem, some can scale poorly with the number of predictors in a data set. Screening methods scale linearly…

Methodology · Statistics 2023-01-09 Naveed Merchant , Jeffrey D. Hart

Variable screening with multiple studies

Advancement in technology has generated abundant high-dimensional data that allows integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…

Methodology · Statistics 2017-10-12 Tianzhou Ma , Zhao Ren , George C. Tseng

Greedy Forward Regression for Variable Screening

Two popular variable screening methods under the ultra-high dimensional setting with the desirable sure screening property are the sure independence screening (SIS) and the forward regression (FR). Both are classical variable screening…

Methodology · Statistics 2015-11-05 Ming-Yen Cheng , Sanying Feng , Gaorong Li , Heng Lian

Nonparametric Independence Screening in Sparse Ultra-High Dimensional Varying Coefficient Models

The varying-coefficient model is an important nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is big, the issue of variable selection…

Statistics Theory · Mathematics 2013-03-05 Jianqing Fan , Yunbei Ma , Wei Dai

Robust rank correlation based screening

Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p, small n" paradigms when p can be as…

Methodology · Statistics 2012-10-18 Gaorong Li , Heng Peng , Jun Zhang , Lixing Zhu

Ultrahigh dimensional variable selection: beyond the linear model

Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently-used techniques are based on independence screening; examples include correlation ranking…

Methodology · Statistics 2008-12-18 Jianqing Fan , Richard Samworth , Yichao Wu

Predictive Correlation Screening: Application to Two-stage Predictor Design in High Dimension

We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small…

Machine Learning · Statistics 2013-04-11 Hamed Firouzi , Bala Rajaratnam , Alfred Hero

Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models

A variable screening procedure via correlation learning was proposed Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To…

Methodology · Statistics 2011-01-19 Jianqing Fan , Yang Feng , Rui Song

A robust variable screening procedure for ultra-high dimensional data

Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems and some pre screening of the variables may be necessary. A…

Methodology · Statistics 2020-05-01 Abhik Ghosh , Magne Thoresen

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang , Jun S. Liu

A note on marginal correlation based screening

Independence screening methods such as the two sample $t$-test and the marginal correlation based ranking are among the most widely used techniques for variable selection in ultrahigh dimensional data sets. In this short note, simple…

Methodology · Statistics 2020-11-17 Run Wang , Somak Dutta , Vivekananda Roy

Bayesian dynamic variable selection in high dimensions

This paper proposes a variational Bayes algorithm for computationally efficient posterior and predictive inference in time-varying parameter (TVP) models. Within this context we specify a new dynamic variable/model selection strategy for…

Computation · Statistics 2021-12-23 Gary Koop , Dimitris Korobilis

Random Partitioning and Distribution-based Thresholding for Iterative Variable Screening in High Dimensions

In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…

Methodology · Statistics 2019-02-13 Yu-Hsiang Cheng , Tzee-Ming Huang , Su-Yun Huang

Conditional variable screening for ultra-high dimensional longitudinal data with time interactions

In recent years we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a…

Methodology · Statistics 2025-02-07 Andrea Bratsberg , Abhik Ghosh , Magne Thoresen

Iterative variable selection for high-dimensional data with binary outcomes

We propose an iterative variable selection scheme for high-dimensional data with binary outcomes. The scheme adopts a structured screen-and-select framework and uses non-local prior-based Bayesian model selection within the same. The…

Methodology · Statistics 2022-11-08 Nilotpal Sanyal

PPFS: Predictive Permutation Feature Selection

We propose Predictive Permutation Feature Selection (PPFS), a novel wrapper-based feature selection method based on the concept of Markov Blanket (MB). Unlike previous MB methods, PPFS is a universal feature selection technique as it can…

Machine Learning · Computer Science 2024-10-11 Atif Hassan , Jiaul H. Paik , Swanand Khare , Syed Asif Hassan

Are screening methods useful in feature selection? An empirical study

Filter or screening methods are often used as a preprocessing step for reducing the number of variables used by a learning algorithm in obtaining a classification or regression model. While there are many such filter methods, there is a…

Machine Learning · Statistics 2019-09-13 Mingyuan Wang , Adrian Barbu

Adaptive partition Factor Analysis

Factor Analysis has traditionally been utilized across diverse disciplines to extrapolate latent traits that influence the behavior of multivariate observed variables. Historically, the focus has been on analyzing data from a single study,…

Methodology · Statistics 2026-01-22 Elena Bortolato , Antonio Canale