Related papers: Cluster Stability Selection

200 papers

Estimation of structure, such as in variable selection, graphical modelling or cluster analysis is notoriously difficult, especially for high-dimensional data. We introduce stability selection. It is based on subsampling in combination with…

Methodology · Statistics 2009-05-16 Nicolai Meinshausen , Peter Buehlmann

In variable or graph selection problems, finding a right-sized model or controlling the number of false positives is notoriously difficult. Recently, a meta-algorithm called Stability Selection was proposed that can provide reliable…

Machine Learning · Statistics 2017-12-14 George Philipp , Seunghak Lee , Eric P. Xing

Stability Selection was recently introduced by Meinshausen and Buhlmann (2010) as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection…

Statistics Theory · Mathematics 2016-04-27 Rajen D. Shah , Richard J. Samworth

The Lasso has been widely used as a method for variable selection, valued for its simplicity and empirical performance. However, Lasso's selection stability deteriorates in the presence of correlated predictors. Several approaches have been…

Methodology · Statistics 2025-11-05 Mahdi Nouraie , Houying Zhu , Samuel Muller

We study the problem of linear feature selection when features are highly correlated. Such settings pose two fundamental challenges. First, how should model similarity be defined? Simply counting features in common can be misleading: two…

Methodology · Statistics 2026-03-24 Xiaozhu Zhang , Jacob Bien , Armeen Taeb

A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the…

Machine Learning · Statistics 2010-07-08 Ulrike von Luxburg

The Lasso is a prominent algorithm for variable selection. However, its instability in the presence of correlated variables in the high-dimensional setting is well-documented. Although previous research has attempted to address this issue…

Methodology · Statistics 2025-05-28 Mahdi Nouraie , Connor Smith , Samuel Muller

We study feature selection in high-dimensional regression under two distinct sources of instability: sampling variability and measurement error in the design matrix. Stability Selection addresses the former through sub-sampling and…

Methodology · Statistics 2026-05-05 Mahdi Nouraie , Houying Zhu , Samuel Muller

In modern data analysis, sparse model selection becomes inevitable once the number of predictor variables is very high. It is well-known that model selection procedures like the Lasso or Boosting tend to overfit on real data. The…

Machine Learning · Computer Science 2022-02-11 Tino Werner

Model selection is a major challenge in non-parametric clustering. There is no universally admitted way to evaluate clustering results for the obvious reason that no ground truth is available. The difficulty to find a universal evaluation…

Machine Learning · Computer Science 2023-05-18 Alex Mourer , Florent Forest , Mustapha Lebbah , Hanane Azzag , Jérôme Lacaille

Recently, many regularized procedures have been proposed for variable selection in linear regression, but their performance depends on the tuning parameter selection. Here a criterion for the tuning parameter selection is proposed, which…

Methodology · Statistics 2013-01-31 Yixin Fang , Junhui Wang , Wei Sun

We introduce extensions of stability selection, a method to stabilise variable selection methods introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010). We propose to apply a base selection method repeatedly to random…

Methodology · Statistics 2016-10-26 Andre Beinrucker , Ürün Dogan , Gilles Blanchard

Meinshausen and Buhlmann [Ann. Statist. 34 (2006) 1436-1462] showed that, for neighborhood selection in Gaussian graphical models, under a neighborhood stability condition, the LASSO is consistent, even when the number of variables is of…

Statistics Theory · Mathematics 2008-08-08 Cun-Hui Zhang , Jian Huang

Feature selection, as a vital dimension reduction technique, reduces data dimension by identifying an essential subset of input features, which can facilitate interpretable insights into learning and inference processes. Algorithmic…

Machine Learning · Computer Science 2022-01-06 Xinxing Wu , Qiang Cheng

In many high dimensional classification or regression problems set in a biological context, the complete identification of the set of informative features is often as important as predictive accuracy, since this can provide mechanistic…

Machine Learning · Computer Science 2020-03-02 Yuxin Sun , Benny Chain , Samuel Kaski , John Shawe-Taylor

The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…

Physics and Society · Physics 2012-03-29 Andrea Lancichinetti , Santo Fortunato

The instability in the selection of models is a major concern with data sets containing a large number of covariates. We focus on stability selection which is used as a technique to improve variable selection performance for a range of…

Methodology · Statistics 2016-04-26 Md Hasinur Rahaman Khan , Anamika Bhadra , Tamanna Howlader

Deep clustering methods improve the performance of clustering tasks by jointly optimizing deep representation learning and clustering. While numerous deep clustering algorithms have been proposed, most of them rely on artificially…

Machine Learning · Computer Science 2024-01-30 Zhanwen Cheng , Feijiang Li , Jieting Wang , Yuhua Qian

In this article we investigate consistency of selection in regression models via the popular Lasso method. Here we depart from the traditional linear regression assumption and consider approximations of the regression function $f$ with…

Statistics Theory · Mathematics 2008-12-18 Florentina Bunea

Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical…

Statistics Theory · Mathematics 2013-10-02 Bin Yu