Related papers: High-Breakdown Robust Multivariate Methods

200 papers

Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for…

Machine Learning · Statistics 2021-01-13 Peter J. Rousseeuw, Mia Hubert

Learning in the presence of outliers is a fundamental problem in statistics. Until recently, all known efficient unsupervised learning algorithms were very sensitive to outliers in high dimensions. In particular, even for the task of robust…

Data Structures and Algorithms · Computer Science 2019-11-15 Ilias Diakonikolas, Daniel M. Kane

Cellwise outliers are likely to occur together with casewise outliers in modern data sets with relatively large dimension. Recent work has shown that traditional robust regression methods may fail for data sets in this paradigm. The…

Statistics Theory · Mathematics 2016-12-28 Andy Leung, Hongyang Zhang, Ruben H. Zamar

This paper presents a fast methodology, called ROBOUT, to identify outliers in a response variable conditional on a set of linearly related predictors, retrieved from a large granular dataset. ROBOUT is shown to be effective and…

Methodology · Statistics 2021-04-27 Matteo Farnè, Angelos Vouldis

Nonparametric regression models offer a way to understand and quantify relationships between variables without having to identify an appropriate family of possible regression functions. Although many estimation methods for these models have…

Methodology · Statistics 2023-04-07 Matias Salibian-Barrera

This note investigates the problem of detecting outliers in longitudinal data. It compares well-known methods used in official statistics with proposals from the fields of data mining and machine learning that are based on the distance…

Methodology · Statistics 2025-07-30 Marcello D'Orazio

Outlier detection algorithms typically assign an outlier score to each observation in a dataset, indicating the degree to which an observation is an outlier. However, these scores are often not comparable across algorithms and can be…

Machine Learning · Computer Science 2024-10-31 Philipp Röchner, Henrique O. Marques, Ricardo J. G. B. Campello, Arthur Zimek, Franz Rothlauf

Multivariate location and scatter matrix estimation is a cornerstone in multivariate data analysis. We consider this problem when the data may contain independent cellwise and casewise outliers. Flat data sets with a large number of…

Statistics Theory · Mathematics 2014-06-24 Claudio Agostinelli, Andy Leung, Victor J. Yohai, Ruben H. Zamar

Classical discriminant analysis (DA) is based on the mean and empirical covariance matrix of each class, both of which are sensitive to outliers in the data. In the past the focus was on casewise outliers, that is, datapoints that lie far…

Methodology · Statistics 2026-05-29 Fabio Centofanti, Can Hakan Dagidir, Mia Hubert, Peter J. Rousseeuw

Machine learning and data analysis have been used in many robotics fields, especially for modelling. Data are usually the result of sensor measurements and, as such, may be subject to noise and outliers. The presence of outliers…

Robotics · Computer Science 2019-08-26 Francesco Cursi, Guang-Zhong Yang

A robust estimation framework for binary regression models is studied, aiming to extend traditional approaches like logistic regression models. While previous studies largely focused on logistic models, we explore a broader class of models…

Methodology · Statistics 2025-02-24 Kenichi Hayashi, Shinto Eguchi

Outlying observations can be challenging to handle and adversely affect subsequent analyses, especially in data with increasing dimensional complexity. Although outliers are not always undesired anomalies in the data and may possess…

Methodology · Statistics 2025-09-18 Anthony-Alexander Christidis, Gabriela Cohen-Freue

Most regularization methods, such as the LASSO, have one or more regularization parameters, and selecting the value of a regularization parameter is essentially equivalent to selecting a model. Thus, to obtain a model suitable for the…

Methodology · Statistics 2025-11-07 Sumito Kurata, Kei Hirose

The sample covariance matrix is a cornerstone of multivariate statistics, but it is highly sensitive to outliers. These can be casewise outliers, such as cases belonging to a different population, or cellwise outliers, which are deviating…

Methodology · Statistics 2025-05-27 Fabio Centofanti, Mia Hubert, Peter J. Rousseeuw

Principal component regression uses principal components as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature treating of Bayesian approaches is relatively sparse. We…

Methodology · Statistics 2020-01-28 Philippe Gagnon, Mylène Bédard, Alain Desgagné

The inflated beta regression model is widely used for modeling continuous proportions with values at the boundaries. Maximum likelihood estimation for these models is well-known for its sensitivity to outliers, which can severely distort…

Methodology · Statistics 2026-05-15 Francisco Felipe Queiroz, Silvia Lopes de Paula Ferrari

A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced…

Quantitative Methods · Quantitative Biology 2014-03-17 Xiaobei Zhou, Helen Lindsay, Mark D. Robinson

The best subset selection (or "best subsets") estimator is a classic tool for sparse regression, and developments in mathematical optimization over the past decade have made it more computationally tractable than ever. Notwithstanding its…

Methodology · Statistics 2022-01-11 Ryan Thompson

In data analysis, contamination caused by outliers is inevitable, and robust statistical methods are strongly demanded. In this paper, our concern is to develop a new approach for robust data analysis based on scoring rules. The scoring…

Statistics Theory · Mathematics 2013-11-22 Takafumi Kanamori, Hironori Fujisawa

Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the…

Statistics Theory · Mathematics 2020-12-15 Banghua Zhu, Jiantao Jiao, Jacob Steinhardt