Related papers: High-Breakdown Robust Multivariate Methods

Anomaly Detection by Robust Statistics

Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for…

Machine Learning · Statistics 2021-01-13 Peter J. Rousseeuw, Mia Hubert

Recent Advances in Algorithmic High-Dimensional Robust Statistics

Learning in the presence of outliers is a fundamental problem in statistics. Until recently, all known efficient unsupervised learning algorithms were very sensitive to outliers in high dimensions. In particular, even for the task of robust…

Data Structures and Algorithms · Computer Science 2019-11-15 Ilias Diakonikolas, Daniel M. Kane

Robust regression estimation and inference in the presence of cellwise and casewise contamination

Cellwise outliers are likely to occur together with casewise outliers in modern data sets with relatively large dimension. Recent work has shown that traditional robust regression methods may fail for data sets in this paradigm. The…

Statistics Theory · Mathematics 2016-12-28 Andy Leung, Hongyang Zhang, Ruben H. Zamar

Robust selection of predictors and conditional outlier detection in a perturbed large-dimensional regression context

This paper presents a fast methodology, called ROBOUT, to identify outliers in a response variable conditional on a set of linearly related predictors, retrieved from a large granular dataset. ROBOUT is shown to be effective and…

Methodology · Statistics 2021-04-27 Matteo Farnè, Angelos Vouldis

Robust nonparametric regression: review and practical considerations

Nonparametric regression models offer a way to understand and quantify relationships between variables without having to identify an appropriate family of possible regression functions. Although many estimation methods for these models have…

Methodology · Statistics 2023-04-07 Matias Salibian-Barrera

An empirical comparison of some outlier detection methods with longitudinal data

This note investigates the problem of detecting outliers in longitudinal data. It compares well-known methods used in official statistics with proposals from the fields of data mining and machine learning that are based on the distance…

Methodology · Statistics 2025-07-30 Marcello D'Orazio

Robust Statistical Scaling of Outlier Scores: Improving the Quality of Outlier Probabilities for Outliers (Extended Version)

Outlier detection algorithms typically assign an outlier score to each observation in a dataset, indicating the degree to which an observation is an outlier. However, these scores are often not comparable across algorithms and can be…

Machine Learning · Computer Science 2024-10-31 Philipp Röchner, Henrique O. Marques, Ricardo J. G. B. Campello, Arthur Zimek, Franz Rothlauf

Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination

Multivariate location and scatter matrix estimation is a cornerstone in multivariate data analysis. We consider this problem when the data may contain independent cellwise and casewise outliers. Flat data sets with a large number of…

Statistics Theory · Mathematics 2014-06-24 Claudio Agostinelli, Andy Leung, Victor J. Yohai, Ruben H. Zamar

Cellwise Robust Discriminant Analysis

Classical discriminant analysis (DA) is based on the mean and empirical covariance matrix of each class, both of which are sensitive to outliers in the data. In the past the focus was on casewise outliers, that is, datapoints that lie far…

Methodology · Statistics 2026-05-29 Fabio Centofanti, Can Hakan Dagidir, Mia Hubert, Peter J. Rousseeuw

A Robust Regression Approach for Robot Model Learning

Machine learning and data analysis have been used in many robotics fields, especially for modelling. Data are usually the result of sensor measurements and, as such, they might be subjected to noise and outliers. The presence of outliers…

Robotics · Computer Science 2019-08-26 Francesco Cursi, Guang-Zhong Yang

On a class of binary regression models and their robust estimation

A robust estimation framework for binary regression models is studied, aiming to extend traditional approaches like logistic regression models. While previous studies largely focused on logistic models, we explore a broader class of models…

Methodology · Statistics 2025-02-24 Kenichi Hayashi, Shinto Eguchi

Robust Multi-Model Subset Selection

Outlying observations can be challenging to handle and adversely affect subsequent analyses, especially in data with increasing dimensional complexity. Although outliers are not always undesired anomalies in the data and may possess…

Methodology · Statistics 2025-09-18 Anthony-Alexander Christidis, Gabriela Cohen-Freue

Robust and consistent model evaluation criteria in high-dimensional regression

Most of the regularization methods such as the LASSO have one (or more) regularization parameter(s), and to select the value of the regularization parameter is essentially equal to select a model. Thus, to obtain a model suitable for the…

Methodology · Statistics 2025-11-07 Sumito Kurata, Kei Hirose

Cellwise and Casewise Robust Covariance in High Dimensions

The sample covariance matrix is a cornerstone of multivariate statistics, but it is highly sensitive to outliers. These can be casewise outliers, such as cases belonging to a different population, or cellwise outliers, which are deviating…

Methodology · Statistics 2025-05-27 Fabio Centofanti, Mia Hubert, Peter J. Rousseeuw

An automatic robust Bayesian approach to principal component regression

Principal component regression uses principal components as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature treating of Bayesian approaches is relatively sparse. We…

Methodology · Statistics 2020-01-28 Philippe Gagnon, Mylène Bédard, Alain Desgagné

Robust inference in inflated beta regression

The inflated beta regression model is widely used for modeling continuous proportions with values at the boundaries. Maximum likelihood estimation for these models is well-known for its sensitivity to outliers, which can severely distort…

Methodology · Statistics 2026-05-15 Francisco Felipe Queiroz, Silvia Lopes de Paula Ferrari

Robustly detecting differential expression in RNA sequencing data using observation weights

A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced…

Quantitative Methods · Quantitative Biology 2014-03-17 Xiaobei Zhou, Helen Lindsay, Mark D. Robinson

Robust subset selection

The best subset selection (or "best subsets") estimator is a classic tool for sparse regression, and developments in mathematical optimization over the past decade have made it more computationally tractable than ever. Notwithstanding its…

Methodology · Statistics 2022-01-11 Ryan Thompson

Robust Estimation under Heavy Contamination using Enlarged Models

In data analysis, contamination caused by outliers is inevitable, and robust statistical methods are strongly demanded. In this paper, our concern is to develop a new approach for robust data analysis based on scoring rules. The scoring…

Statistics Theory · Mathematics 2013-11-22 Takafumi Kanamori, Hironori Fujisawa

Generalized Resilience and Robust Statistics

Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the…

Statistics Theory · Mathematics 2020-12-15 Banghua Zhu, Jiantao Jiao, Jacob Steinhardt