Related papers: Efficient Test-based Variable Selection for High-d…

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang , Jun S. Liu

High-dimensional variable selection

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression…

Statistics Theory · Mathematics 2009-08-20 Larry Wasserman , Kathryn Roeder

Cross-Validation for Unsupervised Learning

Cross-validation (CV) is a popular method for model-selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised…

Methodology · Statistics 2009-09-17 Patrick O. Perry

ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data

High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which…

Methodology · Statistics 2021-07-09 Kaixu Yang , Tapabrata Maiti

A Transparent and Nonlinear Method for Variable Selection

Variable selection is a procedure to attain the truly important predictors from inputs. Complex nonlinear dependencies and strong coupling pose great challenges for variable selection in high-dimensional data. In addition, real-world…

Methodology · Statistics 2023-07-04 Keyao Wang , Huiwen Wang , Jichang Zhao , Lihong Wang

Efficient kernel-based variable selection with sparsistency

Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm shall be flexible, scalable, and with theoretical guarantee, yet most existing algorithms…

Machine Learning · Statistics 2021-02-04 Xin He , Junhui Wang , Shaogao Lv

Forward variable selection for sparse ultra-high dimensional varying coefficient models

Varying coefficient models have numerous applications in a wide scope of scientific areas. While enjoying nice interpretability, they also allow flexibility in modeling dynamic impacts of the covariates. But, in the new era of big data, it…

Methodology · Statistics 2014-10-27 Ming-Yen Cheng , Toshio Honda , Jin-Ting Zhang

Comprehensive Stepwise Selection for Logistic Regression

Automated variable selection is widely applied in statistical model development. Algorithms like forward, backward or stepwise selection are available in statistical software packages like R and SAS. Many researchers have criticized the use…

Methodology · Statistics 2023-06-19 Bernd Engelmann

Ultrahigh dimensional variable selection: beyond the linear model

Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently-used techniques are based on independence screening; examples include correlation ranking…

Methodology · Statistics 2008-12-18 Jianqing Fan , Richard Samworth , Yichao Wu

The restricted consistency property of leave-$n_v$-out cross-validation for high-dimensional variable selection

Cross-validation (CV) methods are popular for selecting the tuning parameter in the high-dimensional variable selection problem. We show the mis-alignment of the CV is one possible reason of its over-selection behavior. To fix this issue,…

Methodology · Statistics 2018-01-17 Yang Feng , Yi Yu

Optimal Two-Step Prediction in Regression

High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning…

Methodology · Statistics 2017-06-07 Didier Chételat , Johannes Lederer , Joseph Salmon

Conditional Variable Selection for Intelligent Test

Intelligent test requires efficient and effective analysis of high-dimensional data in a large scale. Traditionally, the analysis is often conducted by human experts, but it is not scalable in the era of big data. To tackle this challenge,…

Machine Learning · Computer Science 2022-07-04 Yiwen Liao , Tianjie Ge , Raphaël Latty , Bin Yang

High-dimensional variable selection via tilting

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…

Methodology · Statistics 2016-11-29 Haeran Cho , Piotr Fryzlewicz

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has achieved an empirical success due to its simplicity and intuitiveness. However, despite its ubiquitous role, CV often falls into the following notorious dilemmas. On the…

Machine Learning · Computer Science 2020-12-29 Weikai Li , Chuanxing Geng , Songcan Chen

Cross-validatory model selection for Bayesian autoregressions with exogenous regressors

Bayesian cross-validation (CV) is a popular method for predictive model assessment that is simple to implement and broadly applicable. A wide range of CV schemes is available for time series applications, including generic leave-one-out…

Methodology · Statistics 2023-10-12 Alex Cooper , Dan Simpson , Lauren Kennedy , Catherine Forbes , Aki Vehtari

Model selection by cross-validation in an expectile linear regression

For linear models that may have asymmetric errors, we study variable selection by cross-validation. The data are split into training and validation sets, with the number of observations in the validation set much larger than in the training…

Methodology · Statistics 2026-01-16 Bilel Bousselmi , Gabriela Ciuperca

A review and recommendations on variable selection methods in regression models for binary data

The selection of essential variables in logistic regression is vital because of its extensive use in medical studies, finance, economics and related fields. In this paper, we explore four main typologies (test-based, penalty-based,…

Methodology · Statistics 2022-05-17 Souvik Bag , Kapil Gupta , Soudeep Deb

Linear screening for high-dimensional computer experiments

In this paper we propose a linear variable screening method for computer experiments when the number of input variables is larger than the number of runs. This method uses a linear model to model the nonlinear data, and screens the…

Methodology · Statistics 2020-06-16 Chunya Li , Daijun Chen , Shifeng Xiong

Faithful Variable Screening for High-Dimensional Convex Regression

We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the…

Statistics Theory · Mathematics 2014-11-19 Min Xu , Minhua Chen , John Lafferty

Variable Selection for Clustering and Classification

As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…

Computation · Statistics 2013-03-22 Jeffrey L. Andrews , Paul D. McNicholas