Related papers: Linear screening for high-dimensional computer exp…

Distributed variable screening for generalized linear models

In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed…

Methodology · Statistics 2024-05-09 Tianbo Diao, Lianqiang Qu, Bo Li, Liuquan Sun

Screening Methods for Classification Based on Non-parametric Bayesian Tests

Feature or variable selection is a problem inherent to large data sets. While many methods have been proposed to deal with this problem, some can scale poorly with the number of predictors in a data set. Screening methods scale linearly…

Methodology · Statistics 2023-01-09 Naveed Merchant, Jeffrey D. Hart

Linear Hypothesis Testing in Dense High-Dimensional Linear Models

We propose a methodology for testing linear hypotheses in high-dimensional linear models. The proposed test does not impose any restriction on the size of the model, i.e. model sparsity or the loading vector representing the hypothesis.…

Methodology · Statistics 2019-07-09 Yinchu Zhu, Jelena Bradic

Screening methods for linear errors-in-variables models in high dimensions

Microarray studies, in order to identify genes associated with an outcome of interest, usually produce noisy measurements for a large number of gene expression features from a small number of subjects. One common approach to analyzing such…

Methodology · Statistics 2021-04-21 Linh Nghiem, Francis K. C. Hui, Samuel Mueller, A. H. Welsh

Variable screening with multiple studies

Advancement in technology has generated abundant high-dimensional data that allows integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…

Methodology · Statistics 2017-10-12 Tianzhou Ma, Zhao Ren, George C. Tseng

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

Conditional nonparametric variable screening by neural factor regression

High-dimensional covariates often admit linear factor structure. To effectively screen correlated covariates in high-dimension, we propose a conditional variable screening test based on non-parametric regression using neural networks due to…

Econometrics · Economics 2024-08-21 Jianqing Fan, Weining Wang, Yue Zhao

ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models

Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to…

Statistics Theory · Mathematics 2020-07-07 Talal Ahmed, Waheed U. Bajwa

Ultrahigh dimensional variable selection: beyond the linear model

Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently-used techniques are based on independence screening; examples include correlation ranking…

Methodology · Statistics 2008-12-18 Jianqing Fan, Richard Samworth, Yichao Wu

Large-scale Nonlinear Variable Selection via Kernel Random Features

We propose a new method for input variable selection in nonlinear regression. The method is embedded into a kernel regression machine that can model general nonlinear functions, not being a priori limited to additive models. This is the…

Machine Learning · Computer Science 2018-09-05 Magda Gregorová, Jason Ramapuram, Alexandros Kalousis, Stéphane Marchand-Maillet

High-dimensional variable selection

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression…

Statistics Theory · Mathematics 2009-08-20 Larry Wasserman, Kathryn Roeder

Regularization after retention in ultrahigh dimensional linear regression models

In the ultrahigh dimensional setting, independence screening has been proved, both theoretically and empirically, to be a useful variable selection framework with low computational cost. In this work, we propose a two-step framework by using marginal…

Methodology · Statistics 2017-08-11 Haolei Weng, Yang Feng, Xingye Qiao

Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models

A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To…

Methodology · Statistics 2011-01-19 Jianqing Fan, Yang Feng, Rui Song

A sure independence screening procedure for ultra-high dimensional partially linear additive models

We introduce a two-step procedure, in the context of ultra-high dimensional additive models, which aims to reduce the size of the covariate vector and distinguish linear and nonlinear effects among nonzero components. Our proposed screening…

Statistics Theory · Mathematics 2017-08-30 M. Kazemi, D. Shahsavani, M. Arashi

Variable Selection for Comparing High-dimensional Time-Series Data

Given a pair of multivariate time-series data of the same length and dimensions, an approach is proposed to select variables and time intervals where the two series are significantly different. In applications where one time series is an…

Methodology · Statistics 2024-12-11 Kensuke Mitsuzawa, Margherita Grossi, Stefano Bortoli, Motonobu Kanagawa

Test for high-dimensional linear hypothesis of mean vectors via random integration

In this paper, we investigate hypothesis testing for the linear combination of mean vectors across multiple populations through the method of random integration. We have established the asymptotic distributions of the test statistics under…

Applications · Statistics 2024-03-13 Jianghao Li, Shizhe Hong, Zhenzhen Niu, Zhidong Bai

On Exact Feature Screening in Ultrahigh-dimensional Binary Classification

We propose a new model-free feature screening method based on energy distances for ultrahigh-dimensional binary classification problems. With a high probability, the proposed method retains only relevant features after discarding all the…

Methodology · Statistics 2023-05-19 Sarbojit Roy, Soham Sarkar, Subhajit Dutta, Anil K. Ghosh

Fast Cross-Validation via Sequential Testing

With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which…

Machine Learning · Computer Science 2016-02-05 Tammo Krueger, Danny Panknin, Mikio Braun

On Variable Screening in Multiple Nonparametric Regression Model

In this article, we study the problem of variable screening in the multiple nonparametric regression model. The proposed methodology is based on the fact that the partial derivative of the regression function with respect to the irrelevant…

Methodology · Statistics 2021-01-19 Subhra Sankar Dhar, Prashant Jha, Aranyak Acharyya

A Transparent and Nonlinear Method for Variable Selection

Variable selection is a procedure to attain the truly important predictors from inputs. Complex nonlinear dependencies and strong coupling pose great challenges for variable selection in high-dimensional data. In addition, real-world…

Methodology · Statistics 2023-07-04 Keyao Wang, Huiwen Wang, Jichang Zhao, Lihong Wang