Related papers: Detecting deviating data cells

Challenges of cellwise outliers

It is well-known that real data often contain outliers. The term outlier typically refers to a case, that is, a row of the $n \times d$ data matrix. In recent times a different type has come into focus, the cellwise outliers. These are…

Methodology · Statistics 2024-07-08 Jakob Raymaekers, Peter J. Rousseeuw

Cellwise Outliers

In statistics and machine learning, the traditional meaning of the terms 'outlier' and 'anomaly' is a case in the dataset that behaves differently from the bulk of the data. This raises suspicion that it may belong to a different…

Methodology · Statistics 2026-04-17 Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw

Comparison of Outlier Detection Techniques for Structured Data

An outlier is an observation or data point that lies far from the rest of the data points in a given dataset or, put differently, far from the center of mass of the observations. The presence of outliers can skew statistical measures…

Machine Learning · Computer Science 2021-06-17 Amulya Agarwal, Nitin Gupta

Anomaly Detection by Robust Statistics

Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for…

Machine Learning · Statistics 2021-01-13 Peter J. Rousseeuw, Mia Hubert

Identifying Outliers in Large Matrices via Randomized Adaptive Compressive Sampling

This paper examines the problem of locating outlier columns in a large, otherwise low-rank, matrix. We propose a simple two-step adaptive sensing and inference approach and establish theoretical guarantees for its performance; our results…

Information Theory · Computer Science 2015-06-22 Xingguo Li, Jarvis Haupt

Outlier detection for mixed-type data: A novel approach

Outlier detection can serve as an extremely important tool for researchers from a wide range of fields. From the sectors of banking and marketing to the social sciences and healthcare sectors, outlier detection techniques are very useful…

Methodology · Statistics 2023-12-12 Efthymios Costa, Ioanna Papatsouma

Outlier Detection by Consistent Data Selection Method

Often the challenge associated with tasks like fraud and spam detection[1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or…

Machine Learning · Computer Science 2018-08-22 Utkarsh Porwal, Smruthi Mukund

Handling cellwise outliers by sparse regression and robust covariance

We propose a data-analytic method for detecting cellwise outliers. Given a robust covariance matrix, outlying cells (entries) in a row are found by the cellHandler technique which combines lasso regression with a stepwise application of…

Methodology · Statistics 2024-07-08 Jakob Raymaekers, Peter J. Rousseeuw

DOD: Detection of outliers in high dimensional data with distance of distances

Reliable outlier detection in high-dimensional data is crucial in modern science, yet it remains a challenging task. Traditional methods often break down in these settings due to their reliance on asymptotic behaviors with respect to sample…

Methodology · Statistics 2025-11-05 Seong-ho Lee, Yongho Jeon

Outlier Detection for Improved Data Quality and Diversity in Dialog Systems

In a corpus of data, outliers are either errors: mistakes in the data that are counterproductive, or are unique: informative samples that improve model robustness. Identifying outliers can lead to better datasets by (1) removing noise in…

Computation and Language · Computer Science 2019-04-08 Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars

Detection of outlying proportions

In this paper we introduce a new method for detecting outliers in a set of proportions. It is based on the construction of a suitable two-way contingency table and on the application of an algorithm for the detection of outlying cells in…

Methodology · Statistics 2016-08-04 Flavio Mignone, Fabio Rapallo

Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Outliers are the points which are different from or inconsistent with the rest of the data. They can be novel, new, abnormal, unusual or noisy information. Outliers are sometimes more interesting than the majority of the data. The main…

Computer Vision and Pattern Recognition · Computer Science 2014-06-20 Singh Vijendra, Pathak Shivani

Outliers in dynamic factor models

Dynamic factor models have a wide range of applications in econometrics and applied economics. The basic motivation resides in their capability of reducing a large set of time series to only few indicators (factors). If the number of time…

Statistics Theory · Mathematics 2009-09-29 Roberto Baragona, Francesco Battaglia

MCODE: Multivariate Conditional Outlier Detection

Outlier detection aims to identify unusual data instances that deviate from expected patterns. The outlier detection is particularly challenging when outliers are context dependent and when they are defined by unusual combinations of…

Artificial Intelligence · Computer Science 2015-05-18 Charmgil Hong, Milos Hauskrecht

An empirical comparison of some outlier detection methods with longitudinal data

This note investigates the problem of detecting outliers in longitudinal data. It compares well-known methods used in official statistics with proposals from the fields of data mining and machine learning that are based on the distance…

Methodology · Statistics 2025-07-30 Marcello D'Orazio

Loss Functions for Detecting Outliers in Panel Data

The detection of outliers is of critical importance in the assurance of data quality. Outliers may exist in observed data or in data derived from these observed data, such as estimates and forecasts. An outlier may indicate a problem with…

Methodology · Statistics 2025-10-23 Charles D. Coleman, Thomas Bryan

Detecting Outliers in High-dimensional Data with Mixed Variable Types using Conditional Gaussian Regression Models

Outlier detection has gained increasing interest in recent years, due to newly emerging technologies and the huge amount of high-dimensional data that are now available. Outlier detection can help practitioners to identify unwanted noise…

Statistics Theory · Mathematics 2021-05-20 Mads Lindskou, Torben Tvedebrink, Poul Svante Eriksen, Niels Morling

Outlier Detection and Spatial Analysis Algorithms

Outlier detection is a significant area in data mining. It can be used either to pre-process the data prior to an analysis or after the processing phase (before visualization), depending on the effectiveness of the outlier and its importance…

Machine Learning · Statistics 2021-06-22 Jacob John

Outlier Detection in Contingency Tables based on Minimal Patterns

A new technique for the detection of outliers in contingency tables is introduced. Outliers thereby are unexpected cell counts with respect to classical loglinear Poisson models. Subsets of cell counts called minimal patterns are defined,…

Computation · Statistics 2012-11-15 Sonja Kuhnt, Fabio Rapallo, André Rehage

Outlyingness: why do outliers lie out?

Outlier detection is an inevitable step in most statistical data analyses. However, the mere detection of an outlying case does not always answer all scientific questions associated with that data point. Outlier detection techniques,…

Methodology · Statistics 2019-12-12 Michiel Debruyne, Sebastiaan Höppner, Sven Serneels, Tim Verdonck