Related papers: SuRF: a New Method for Sparse Variable Selection, …

200 papers

Feature selection that selects an informative subset of variables from data not only enhances the model interpretability and performance but also alleviates the resource demands. Recently, there has been growing attention on feature…

Neural and Evolutionary Computing · Computer Science 2023-03-15 Zahra Atashgahi , Xuhao Zhang , Neil Kichler , Shiwei Liu , Lu Yin , Mykola Pechenizkiy , Raymond Veldhuis , Decebal Constantin Mocanu

Within the statistical and machine learning literature, regularization techniques are often used to construct sparse (predictive) models. Most regularization strategies only work for data where all predictors are treated identically, such…

Computation · Statistics 2020-12-16 Sander Devriendt , Katrien Antonio , Tom Reynkens , Roel Verbelen

This paper proposes a sparse regression method that continuously interpolates between Forward Stepwise selection (FS) and the LASSO. When tuned appropriately, our solutions are much sparser than typical LASSO fits but, unlike FS fits,…

Methodology · Statistics 2024-11-20 Ivy Zhang , Robert Tibshirani

Compositional data, where only relative abundances are available, are common in microbiome and other high-throughput sequencing studies. Log ratios between groups of variables serve as key biomarkers in these settings. However, selecting…

Methodology · Statistics 2025-04-02 Jing Ma , Paizhe Xie , Kristyn Pantoja , David E. Jones

We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response…

Statistics Theory · Mathematics 2013-04-16 Debashis Paul , Eric Bair , Trevor Hastie , Robert Tibshirani

Sparse Representation (SR) techniques encode the test samples into a sparse linear combination of all training samples and then classify the test samples into the class with the minimum residual. The classification of SR techniques depends…

Computer Vision and Pattern Recognition · Computer Science 2019-07-01 Chun-Mei Feng , Yong Xu , Zuoyong Li , Jian Yang

Methods for global measurement of transcript abundance such as microarrays and RNA-Seq generate datasets in which the number of measured features far exceeds the number of observations. Extracting biologically meaningful and experimentally…

Methodology · Statistics 2022-06-22 Lei Ding , Gabriel E. Zentner , Daniel J. McDonald

In high-dimensions, many variable selection methods, such as the lasso, are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional…

Methodology · Statistics 2010-06-08 X. Jessie Jeng , Z. John Daye

High-dimensional learning problems, where the number of features exceeds the sample size, often require sparse regularization for effective prediction and variable selection. While established for fully supervised data, these techniques…

Machine Learning · Computer Science 2026-01-01 The Tien Mai , Mai Anh Nguyen , Trung Nghia Nguyen

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…

Methodology · Statistics 2018-06-19 X. Jessie Jeng , Huimin Peng , Wenbin Lu

Sparsity-inducing penalties are useful tools for variable selection and they are also effective for regression settings where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in…

Methodology · Statistics 2020-06-01 Hidetoshi Matsui

The sparse-group lasso performs both variable and group selection, simultaneously using the strengths of the lasso and group lasso. It has found widespread use in genetics, a field that regularly involves the analysis of high-dimensional…

Machine Learning · Statistics 2025-09-18 Fabio Feser , Marina Evangelou

Varying coefficient models have numerous applications in a wide scope of scientific areas. While enjoying nice interpretability, they also allow flexibility in modeling dynamic impacts of the covariates. But, in the new era of big data, it…

Methodology · Statistics 2014-10-27 Ming-Yen Cheng , Toshio Honda , Jin-Ting Zhang

We propose a ranking and selection procedure to prioritize relevant predictors and control false discovery proportion (FDP) of variable selection. Our procedure utilizes a new ranking method built upon the de-sparsified Lasso estimator. We…

Methodology · Statistics 2018-12-12 X. Jessie Jeng , Xiongzhi Chen

In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with high-dimensional genomic and other omics data, a problem that can be…

Methodology · Statistics 2021-12-02 Zhi Zhao , Marco Banterle , Leonardo Bottolo , Sylvia Richardson , Alex Lewin , Manuela Zucknick

We develop a Bayesian methodology aimed at simultaneously estimating low-rank and row-sparse matrices in a high-dimensional multiple-response linear regression model. We consider a carefully devised shrinkage prior on the matrix of…

Methodology · Statistics 2019-04-10 Antik Chakraborty , Anirban Bhattacharya , Bani K. Mallick

Sparse recovery and subset selection are fundamental problems in varied communities, including signal processing, statistics and machine learning. Herein, we focus on an important greedy algorithm for these problems: Backward Stepwise…

Optimization and Control · Mathematics 2021-06-08 Sebastian Ament , Carla Gomes

This manuscript presents the following: (1) an improved version of the Binary Simultaneous Perturbation Stochastic Approximation (SPSA) Method for feature selection in machine learning (Aksakalli and Malekipirbazari, Pattern Recognition…

Feature selection from a large number of covariates (aka features) in a regression analysis remains a challenge in data science, especially in terms of its potential of scaling to ever-enlarging data and finding a group of scientifically…

Machine Learning · Statistics 2020-02-10 Yiying Fan , Jiayang Sun

An important problem in the analysis of high-dimensional omics data is to identify subsets of molecular variables that are associated with a phenotype of interest. This requires addressing the challenges of high dimensionality, strong…

Methodology · Statistics 2022-04-05 Fan Wang , Sylvia Richardson , Steven M. Hill