Related papers: Robust subset selection

200 papers

Outlying observations can be challenging to handle and adversely affect subsequent analyses, especially in data with increasing dimensional complexity. Although outliers are not always undesired anomalies in the data and may possess…

Methodology · Statistics 2025-09-18 Anthony-Alexander Christidis, Gabriela Cohen-Freue

Fully robust versions of the elastic net estimator are introduced for linear and logistic regression. The algorithms to compute the estimators are based on the idea of repeatedly applying the non-robust classical estimators to data subsets…

Methodology · Statistics 2017-03-16 Fatma Sevinc Kurnaz, Irene Hoffmann, Peter Filzmoser

The last decade has seen a number of advances in computationally efficient algorithms for statistical methods subject to robustness constraints. An estimator may be robust in a number of different ways: to contamination of the dataset, to…

Machine Learning · Statistics 2025-09-08 Gautam Kamath

A robust estimator is proposed for the parameters that characterize the linear regression problem. It is based on the notion of shrinkages, often used in Finance and previously studied for outlier detection in multivariate data. A thorough…

Methodology · Statistics 2020-02-07 Elisa Cabana, Rosa E. Lillo, Henry Laniado

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

In this paper we discuss the variable selection method from $\ell_0$-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We…

Methodology · Statistics 2013-03-20 Shifeng Xiong

The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is…

Applications · Statistics 2022-12-08 Aditya Deshmukh, Jing Liu, Venugopal V. Veeravalli

In high-dimensional statistics, variable selection recovers the latent sparse patterns from all possible covariate combinations. This paper proposes a novel optimization method to solve the exact L0-regularized regression problem, which is…

Methodology · Statistics 2022-06-02 Mingzhang Yin, Nhat Ho, Bowei Yan, Xiaoning Qian, Mingyuan Zhou

Cellwise outliers are likely to occur together with casewise outliers in modern data sets with relatively large dimension. Recent work has shown that traditional robust regression methods may fail for data sets in this paradigm. The…

Statistics Theory · Mathematics 2016-12-28 Andy Leung, Hongyang Zhang, Ruben H. Zamar

A robust estimator for a wide family of mixtures of linear regression is presented. Robustness is based on the joint adoption of the Cluster Weighted Model and of an estimator based on trimming and restrictions. The selected model provides…

Methodology · Statistics 2015-02-05 L. A. Garcia-Escudero, A. Gordaliza, F. Greselin, S. Ingrassia, A. Mayo-Iscar

Contamination can severely distort an estimator unless the estimation procedure is suitably robust. This is a well-known issue and has been addressed in Robust Statistics; however, the relation of contamination and distorted variable…

Statistics Theory · Mathematics 2022-07-15 Tino Werner

Sparse methods are the standard approach to obtain interpretable models with high prediction accuracy. Alternatively, algorithmic ensemble methods can achieve higher prediction accuracy at the cost of loss of interpretability. However, the…

Methodology · Statistics 2022-01-11 Anthony Christidis, Stefan Van Aelst, Ruben Zamar

Algorithmic robust statistics has traditionally focused on the contamination model where a small fraction of the samples are arbitrarily corrupted. We consider a recent contamination model that combines two kinds of corruptions: (i) small…

Data Structures and Algorithms · Computer Science 2024-10-23 Thanasis Pittas, Ankit Pensia

We study the optimal sample complexity of variable selection in linear regression under general design covariance, and show that subset selection is optimal while under standard complexity assumptions, efficient algorithms for this problem…

Statistics Theory · Mathematics 2025-10-07 Ming Gao, Bryon Aragam

We study stochastic programs where the decision-maker cannot observe the distribution of the exogenous uncertainties but has access to a finite set of independent samples from this distribution. In this setting, the goal is to find a…

Optimization and Control · Mathematics 2019-12-24 Bart P. G. Van Parys, Peyman Mohajerin Esfahani, Daniel Kuhn

We study the problem of selecting limited features to observe such that models trained on them can perform well simultaneously across multiple subpopulations. This problem has applications in settings where collecting each feature is…

Machine Learning · Computer Science 2025-10-27 Maitreyi Swaroop, Tamar Krishnamurti, Bryan Wilder

In today's era of big data, robust least-squares regression becomes a more challenging problem when considering the adversarial corruption along with explosive growth of datasets. Traditional robust methods can handle the noise but suffer…

Data Structures and Algorithms · Computer Science 2017-10-04 Xuchao Zhang, Liang Zhao, Arnold P. Boedihardjo, Chang-Tien Lu

Many modern datasets are collected automatically and are thus easily contaminated by outliers. This led to renewed interest in robust estimation, including new notions of robustness such as robustness to adversarial contamination of the…

Statistics Theory · Mathematics 2023-05-05 Pierre Alquier, Mathieu Gerber

The problem of identifying the most discriminating features when performing supervised learning has been extensively investigated. In particular, several methods for variable selection in model-based classification have been proposed.…

Applications · Statistics 2020-12-16 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Robust statistical estimators offer resilience against outliers but are often computationally challenging, particularly in high-dimensional sparse settings. Modern optimization techniques are utilized for robust sparse association…

Computation · Statistics 2025-02-03 Pia Pfeiffer, Andreas Alfons, Peter Filzmoser