Related papers: High-dimensional regression and variable selection…

A novel algorithm for simultaneous SNP selection in high-dimensional genome-wide association studies

Background: Identification of causal SNPs in most genome-wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account.…

Applications · Statistics 2012-11-02 Verena Zuber, A. Pedro Duarte Silva, Korbinian Strimmer

Correlation-Adjusted Regression Survival Scores for High-Dimensional Variable Selection

Background: The development of classification methods for personalized medicine is highly dependent on the identification of predictive genetic markers. In survival analysis it is often necessary to discriminate between influential and…

Methodology · Statistics 2018-02-27 Thomas Welchowski, Verena Zuber, Matthias Schmid

High-dimensional variable selection via tilting

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…

Methodology · Statistics 2016-11-29 Haeran Cho, Piotr Fryzlewicz

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

A variable selection approach for highly correlated predictors in high-dimensional genomic data

In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models.…

Methodology · Statistics 2020-07-22 Wencan Zhu, Céline Lévy-Leduc, Nils Ternès

Inference for covariate adjusted regression via varying coefficient models

We consider covariate adjusted regression (CAR), a regression method for situations where predictors and response are observed after being distorted by a multiplicative factor. The distorting factors are unknown functions of an observable…

Statistics Theory · Mathematics 2016-08-16 Damla Şentürk, Hans-Georg Müller

Optimal Feature Selection in High-Dimensional Discriminant Analysis

We consider the high-dimensional discriminant analysis problem. For this problem, different methods have been proposed and justified by establishing exact convergence rates for the classification risk, as well as the ℓ2 convergence results…

Machine Learning · Statistics 2013-06-28 Mladen Kolar, Han Liu

Cross-Leverage Scores for Selecting Subsets of Explanatory Variables

In a standard regression problem, we have a set of explanatory variables whose effect on some response vector is modeled. For wide binary data, such as genetic marker data, we often have two limitations. First, we have more parameters than…

Methodology · Statistics 2021-09-20 Katharina Parry, Leo N. Geppert, Alexander Munteanu, Katja Ickstadt

High-dimensional variable selection

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression…

Statistics Theory · Mathematics 2009-08-20 Larry Wasserman, Kathryn Roeder

High dimensional VAR with low rank transition

We propose a vector auto-regressive (VAR) model with a low-rank constraint on the transition matrix. This new model is well suited to predict high-dimensional series that are highly correlated, or that are driven by a small number of hidden…

Statistics Theory · Mathematics 2022-01-17 Pierre Alquier, Karine Bertin, Paul Doukhan, Rémy Garnier

Prediction of multivariate responses with a select number of principal components

This paper proposes a new method and algorithm for predicting multivariate responses in a regression setting. Research into classification of High Dimension Low Sample Size (HDLSS) data, in particular microarray data, has made considerable…

Methodology · Statistics 2008-07-28 Inge Koch, Kanta Naito

Post-Lasso Inference for High-Dimensional Regression

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…

Methodology · Statistics 2018-06-19 X. Jessie Jeng, Huimin Peng, Wenbin Lu

Variable selection in balance regression with applications to microbiome compositional data

Compositional data, where only relative abundances are available, are common in microbiome and other high-throughput sequencing studies. Log ratios between groups of variables serve as key biomarkers in these settings. However, selecting…

Methodology · Statistics 2025-04-02 Jing Ma, Paizhe Xie, Kristyn Pantoja, David E. Jones

ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data

High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which…

Methodology · Statistics 2021-07-09 Kaixu Yang, Tapabrata Maiti

varbvs: Fast Variable Selection for Large-scale Regression

We introduce varbvs, a suite of functions written in R and MATLAB for regression analysis of large-scale data sets using Bayesian variable selection methods. We have developed numerical optimization algorithms based on variational…

Computation · Statistics 2017-09-21 Peter Carbonetto, Xiang Zhou, Matthew Stephens

Sequential Advantage Selection for Optimal Treatment Regimes

Variable selection for optimal treatment regime in a clinical trial or an observational study is getting more attention. Most existing variable selection techniques focused on selecting variables that are important for prediction, therefore…

Methodology · Statistics 2014-05-22 Ailin Fan, Wenbin Lu, Rui Song

CAM: Causal additive models, high-dimensional order search and penalized regression

We develop estimation for potentially high-dimensional additive structural equation models. A key component of our approach is to decouple order search among the variables from feature or edge selection in a directed acyclic graph encoding…

Methodology · Statistics 2014-12-02 Peter Bühlmann, Jonas Peters, Jan Ernest

Homogeneity in Regression

This paper explores the homogeneity of coefficients in high-dimensional regression, which extends the sparsity concept and is more general and suitable for many applications. Homogeneity arises when one expects regression coefficients…

Methodology · Statistics 2013-04-01 Tracy Ke, Jianqing Fan, Yichao Wu

Efficient kernel-based variable selection with sparsistency

Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm shall be flexible, scalable, and with theoretical guarantee, yet most existing algorithms…

Machine Learning · Statistics 2021-02-04 Xin He, Junhui Wang, Shaogao Lv

High-dimensional Log-Error-in-Variable Regression with Applications to Microbial Compositional Data Analysis

In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the…

Methodology · Statistics 2021-03-11 Pixu Shi, Yuchen Zhou, Anru R. Zhang