Related papers: Fast robust correlation for high-dimensional data
The sample covariance matrix is a cornerstone of multivariate statistics, but it is highly sensitive to outliers. These can be casewise outliers, such as cases belonging to a different population, or cellwise outliers, which are deviating…
When applying a statistical method in practice it often occurs that some observations deviate from the usual assumptions. However, many classical methods are sensitive to outliers. The goal of robust statistics is to develop methods that…
Robust estimation of the covariance matrix and detection of outliers remain major challenges in statistical data analysis, particularly when the proportion of contaminated observations increases with the size of the dataset. Outliers can…
Large datasets are often affected by cell-wise outliers in the form of missing or erroneous data. However, discarding any samples containing outliers may result in a dataset that is too small to accurately estimate the covariance matrix.…
The geometric median covariation matrix is a robust multivariate indicator of dispersion which can be extended without any difficulty to functional data. We define estimators, based on recursive algorithms, that can be simply updated at…
Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for…
Reliable outlier detection in high-dimensional data is crucial in modern science, yet it remains a challenging task. Traditional methods often break down in these settings due to their reliance on asymptotic behaviors with respect to sample…
Principal Component Analysis (PCA) finds a linear mapping and maximizes the variance of the data which makes PCA sensitive to outliers and may cause wrong eigendirection. In this paper, we propose techniques to solve this problem; we use…
This work addresses the challenges of robust covariance estimation and interpretable outlier detection for multivariate functional data with separable covariance structure. We develop a method that simultaneously improves robustness and…
Cellwise outliers are likely to occur together with casewise outliers in modern data sets with relatively large dimension. Recent work has shown that traditional robust regression methods may fail for data sets in this paradigm. The…
Cellwise outliers are widespread in data and traditional robust methods may fail when applied to datasets under such contamination. We propose a variable selection procedure, that uses a pairwise robust estimator to obtain an initial…
Quadratic and Linear Discriminant Analysis (QDA/LDA) are the most often applied classification rules under normality. In QDA, a separate covariance matrix is estimated for each group. If there are more variables than observations in the…
There exist multiple methods to detect outliers in multivariate data in the literature, but most of them require to estimate the covariance matrix. The higher the dimension, the more complex the estimation of the matrix becoming impossible…
Outlier detection has gained increasing interest in recent years, due to newly emerging technologies and the huge amount of high-dimensional data that are now available. Outlier detection can help practitioners to identify unwanted noise…
Multivariate linear regression is a fundamental statistical task, but classical estimators such as ordinary least squares are highly sensitive to outliers. These may occur as casewise outliers that affect entire observations, or as outlying…
One of the significant challenges in monitoring the quality of products today is the high dimensionality of quality characteristics. In this paper, we address Phase I analysis of high-dimensional processes with individual observations when…
Learning in the presence of outliers is a fundamental problem in statistics. Until recently, all known efficient unsupervised learning algorithms were very sensitive to outliers in high dimensions. In particular, even for the task of robust…
The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is…
This paper proposes an adaptive penalized weighted mean regression for outlier detection of high-dimensional data. In comparison to existing approaches based on the mean shift model, the proposed estimators demonstrate robustness against…
Detecting changes in high-dimensional vectors presents significant challenges, especially when the post-change distribution is unknown and time-varying. This paper introduces a novel robust algorithm for correlation change detection in…