English
Related papers

Related papers: Distributed variable screening for generalized lin…

200 papers

In this paper we propose a linear variable screening method for computer experiments when the number of input variables is larger than the number of runs. This method uses a linear model to model the nonlinear data, and screens the…

Methodology · Statistics 2020-06-16 Chunya Li , Daijun Chen , Shifeng Xiong

In variable selection, most existing screening methods focus on marginal effects and ignore dependence between covariates. To improve the performance of selection, we incorporate pairwise effects in covariates for screening and…

Methodology · Statistics 2019-02-12 Siliang Gong , Kai Zhang , Yufeng Liu

We propose generalized additive partial linear models for complex data which allow one to capture nonlinear patterns of some covariates, in the presence of linear components. The proposed method improves estimation efficiency and increases…

Statistics Theory · Mathematics 2014-05-26 Li Wang , Lan Xue , Annie Qu , Hua Liang

High-dimensional covariates often admit linear factor structure. To effectively screen correlated covariates in high-dimension, we propose a conditional variable screening test based on non-parametric regression using neural networks due to…

Econometrics · Economics 2024-08-21 Jianqing Fan , Weining Wang , Yue Zhao

We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors where the vector of coefficients can vary from sparse to dense. In this setting, we propose a combination of probabilistic…

Methodology · Statistics 2025-05-13 Roman Parzer , Peter Filzmoser , Laura Vana-Gür

Feature screening is a powerful tool in the analysis of high dimensional data. When the sample size $N$ and the number of features $p$ are both large, the implementation of classic screening methods can be numerically challenging. In this…

Methodology · Statistics 2019-03-12 Xingxiang Li , Runze Li , Zhiming Xia , Chen Xu

We propose a distributed method for simultaneous inference for datasets with sample size much larger than the number of covariates, i.e., N >> p, in the generalized linear models framework. When such datasets are too big to be analyzed…

Methodology · Statistics 2020-07-23 Lu Tang , Ling Zhou , Peter X. -K. Song

As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure…

Methodology · Statistics 2023-03-14 Sifan Liu , Snigdha Panigrahi

Distributed optimization has been widely used as one of the most efficient approaches for model training with massive samples. However, large-scale learning problems with both massive samples and high-dimensional features widely exist in…

Machine Learning · Computer Science 2022-04-26 Runxue Bao , Xidong Wu , Wenhan Xian , Heng Huang

Microarray studies, in order to identify genes associated with an outcome of interest, usually produce noisy measurements for a large number of gene expression features from a small number of subjects. One common approach to analyzing such…

Methodology · Statistics 2021-04-21 Linh Nghiem , Francis K. C. Hui , Samuel Mueller , A. H. Welsh

We propose a new variable selection procedure for a functional linear model with multiple scalar responses and multiple functional predictors. This method is based on basis expansions of the involved functional predictors and coefficients…

Statistics Theory · Mathematics 2023-11-03 Alban Mina Mbina , Guy Martial Nkiet

The focus of modern biomedical studies has gradually shifted to explanation and estimation of joint effects of high dimensional predictors on disease risks. Quantifying uncertainty in these estimates may provide valuable insight into…

Methodology · Statistics 2021-03-09 Zhe Fei , Yi Li

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…

Methodology · Statistics 2016-11-29 Haeran Cho , Piotr Fryzlewicz

In this article, we study the problem of variable screening in multiple nonparametric regression model. The proposed methodology is based on the fact that the partial derivative of the regression function with respect to the irrelevant…

Methodology · Statistics 2021-01-19 Subhra Sankar Dhar , Prashant Jha , Aranyak Acharyya

Linear mixed models are a versatile statistical tool to study data by accounting for fixed effects and random effects from multiple sources of variability. In many situations, a large number of candidate fixed effects is available and it is…

Methodology · Statistics 2022-09-09 Emanuele Degani , Luca Maestrini , Dorota Toczydłowska , Matt P. Wand

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang , Jun S. Liu

Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design…

Methodology · Statistics 2018-02-13 Jonathan P Williams , Jan Hannig

In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…

Methodology · Statistics 2019-02-13 Yu-Hsiang Cheng , Tzee-Ming Huang , Su-Yun Huang

Feature or variable selection is a problem inherent to large data sets. While many methods have been proposed to deal with this problem, some can scale poorly with the number of predictors in a data set. Screening methods scale linearly…

Methodology · Statistics 2023-01-09 Naveed Merchant , Jeffrey D. Hart

The principal support vector machines method (Li et al., 2011) is a powerful tool for sufficient dimension reduction that replaces original predictors with their low-dimensional linear combinations without loss of information. However, the…

Machine Learning · Statistics 2019-12-02 Jun Jin , Chao Ying , Zhou Yu
‹ Prev 1 2 3 10 Next ›