Related papers: Stability Selection for Structured Variable Select…
Estimation of structure, such as in variable selection, graphical modelling or cluster analysis is notoriously difficult, especially for high-dimensional data. We introduce stability selection. It is based on subsampling in combination with…
Modern biotechnologies often result in high-dimensional data sets with much more variables than observations (n $\ll$ p). These data sets pose new challenges to statistical analysis: Variable selection becomes one of the most important…
Stability selection (Meinshausen and Buhlmann, 2010) makes any feature selection method more stable by returning only those features that are consistently selected across many subsamples. We prove (in what is, to our knowledge, the first…
The Lasso has been widely used as a method for variable selection, valued for its simplicity and empirical performance. However, Lasso's selection stability deteriorates in the presence of correlated predictors. Several approaches have been…
The instability in the selection of models is a major concern with data sets containing a large number of covariates. We focus on stability selection which is used as a technique to improve variable selection performance for a range of…
In modern data analysis, sparse model selection becomes inevitable once the number of predictors variables is very high. It is well-known that model selection procedures like the Lasso or Boosting tend to overfit on real data. The…
Stability selection is a popular method for improving feature selection algorithms. One of its key attributes is that it provides theoretical upper bounds on the expected number of false positives, E(FP), enabling false positive control in…
Stability selection is a widely adopted resampling-based framework for high-dimensional variable selection. This paper seeks to broaden the use of an established stability estimator to evaluate the overall stability of the stability…
Recently, many regularized procedures have been proposed for variable selection in linear regression, but their performance depends on the tuning parameter selection. Here a criterion for the tuning parameter selection is proposed, which…
The Lasso is a prominent algorithm for variable selection. However, its instability in the presence of correlated variables in the high-dimensional setting is well-documented. Although previous research has attempted to address this issue…
Stability Selection was recently introduced by Meinshausen and Buhlmann (2010) as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection…
Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model…
We study feature selection in high-dimensional regression under two distinct sources of instability: sampling variability and measurement error in the design matrix. Stability Selection addresses the former through sub-sampling and…
Stability selection represents an attractive approach to identify sparse sets of features jointly associated with an outcome in high-dimensional contexts. We introduce an automated calibration procedure via maximisation of an in-house…
When applying the support vector machine (SVM) to high-dimensional classification problems, we often impose a sparse structure in the SVM to eliminate the influences of the irrelevant predictors. The lasso and other variable selection…
In this paper, we consider voxel selection for functional Magnetic Resonance Imaging (fMRI) brain data with the aim of finding a more complete set of probably correlated discriminative voxels, thus improving interpretation of the discovered…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
The instability in the selection of models is a major concern with data sets containing a large number of covariates. This paper deals with variable selection methodology in the case of high-dimensional problems where the response variable…
In high-dimensions, many variable selection methods, such as the lasso, are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional…
Contamination can severely distort an estimator unless the estimation procedure is suitably robust. This is a well-known issue and has been addressed in Robust Statistics, however, the relation of contamination and distorted variable…