Related papers: High-dimensional variable selection

Optimal Two-Step Prediction in Regression

High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning…

Methodology · Statistics 2017-06-07 Didier Chételat, Johannes Lederer, Joseph Salmon

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

Comparison study of variable selection procedures in high-dimensional Gaussian linear regression

We propose an extensive simulation study to compare some variable selection procedures in a high-dimensional framework. Assuming that the relationship between the active variables and the response variable is linear, the high-dimensional…

Applications · Statistics 2025-03-21 Perrine Lacroix, Mélina Gallopin, Marie-Laure Martin

"Pre-conditioning" for feature selection and regression in high-dimensional problems

We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response…

Statistics Theory · Mathematics 2013-04-16 Debashis Paul, Eric Bair, Trevor Hastie, Robert Tibshirani

Faithful Variable Screening for High-Dimensional Convex Regression

We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the…

Statistics Theory · Mathematics 2014-11-19 Min Xu, Minhua Chen, John Lafferty

The Loss Rank Criterion for Variable Selection in Linear Regression Analysis

Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model…

Methodology · Statistics 2014-02-26 Minh-Ngoc Tran

Post-Lasso Inference for High-Dimensional Regression

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…

Methodology · Statistics 2018-06-19 X. Jessie Jeng, Huimin Peng, Wenbin Lu

High-dimensional variable selection via tilting

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…

Methodology · Statistics 2016-11-29 Haeran Cho, Piotr Fryzlewicz

Two-Stage Testing in a high dimensional setting

In a high dimensional regression setting in which the number of variables ($p$) is much larger than the sample size ($n$), the number of possible two-way interactions between the variables is immense. If the number of variables is in the…

Methodology · Statistics 2024-06-26 Marianne A Jonker, Luc van Schijndel, Eric Cator

ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data

High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which…

Methodology · Statistics 2021-07-09 Kaixu Yang, Tapabrata Maiti

A review and recommendations on variable selection methods in regression models for binary data

The selection of essential variables in logistic regression is vital because of its extensive use in medical studies, finance, economics and related fields. In this paper, we explore four main typologies (test-based, penalty-based,…

Methodology · Statistics 2022-05-17 Souvik Bag, Kapil Gupta, Soudeep Deb

Exact Multivariate Tests - A New Effective Principle of Controlled Model Choice

High-dimensional tests are applied to find relevant sets of variables and relevant models. If variables are selected by analyzing the sums of products matrices and a corresponding mean-value test is performed, there is the danger that the…

Methodology · Statistics 2012-02-10 Juergen Laeuter, Maciej Rosolowski, Ekkehard Glimm

Some Two-Step Procedures for Variable Selection in High-Dimensional Linear Regression

We study the problem of high-dimensional variable selection via some two-step procedures. First we show that given some good initial estimator which is $\ell_{\infty}$-consistent but not necessarily variable selection consistent, we can…

Statistics Theory · Mathematics 2008-10-10 Jian Zhang, Xinge Jessie Jeng, Han Liu

Establishment and Solution of a Multi-Stage Decision Model Based on Hypothesis Testing and Dynamic Programming Algorithm

This paper introduces a novel multi-stage decision-making model that integrates hypothesis testing and dynamic programming algorithms to address complex decision-making scenarios. Initially, we develop a sampling inspection scheme that…

Systems and Control · Electrical Eng. & Systems 2025-03-11 Ziyang Liu, Yurui Hu, Yihan Deng

Controlling false discoveries in high-dimensional situations: Boosting with stability selection

Modern biotechnologies often result in high-dimensional data sets with many more variables than observations ($n \ll p$). These data sets pose new challenges to statistical analysis: Variable selection becomes one of the most important…

Machine Learning · Statistics 2014-11-06 Benjamin Hofner, Luigi Boccuto, Markus Göker

Stability Selection

Estimation of structure, such as in variable selection, graphical modelling or cluster analysis is notoriously difficult, especially for high-dimensional data. We introduce stability selection. It is based on subsampling in combination with…

Methodology · Statistics 2009-05-16 Nicolai Meinshausen, Peter Buehlmann

High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking

Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well-developed, the relative efficacy of different approaches in finite-sample…

Methodology · Statistics 2020-01-29 Fan Wang, Sach Mukherjee, Sylvia Richardson, Steven M. Hill

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang, Jun S. Liu

Regularization after retention in ultrahigh dimensional linear regression models

In the ultrahigh dimensional setting, independence screening has been proved, both theoretically and empirically, to be a useful variable selection framework with low computation cost. In this work, we propose a two-step framework by using marginal…

Methodology · Statistics 2017-08-11 Haolei Weng, Yang Feng, Xingye Qiao

Two-Stage Robust and Sparse Distributed Statistical Inference for Large-Scale Data

In this paper, we address the problem of conducting statistical inference in settings involving large-scale data that may be high-dimensional and contaminated by outliers. The high volume and dimensionality of the data require distributed…

Machine Learning · Statistics 2022-11-30 Emadaldin Mozafari-Majd, Visa Koivunen