Related papers: ChauBoxplot and AdaptiveBoxplot: Two R packages fo…

When Tukey meets Chauvenet: a new boxplot criterion for outlier detection

The box-and-whisker plot, introduced by Tukey (1977), is one of the most popular graphical methods in descriptive statistics. However, Tukey's boxplot is free of sample size, yielding the so-called "one-size-fits-all"…

Methodology · Statistics 2025-06-10 Hongmei Lin, Riquan Zhang, Tiejun Tong

Unifying Boxplots: A Multiple Testing Perspective

Tukey's boxplot is a foundational tool for exploratory data analysis, but its classic outlier-flagging rule does not account for the sample size, and subsequent modifications have often been presented as separate, heuristic adjustments. In…

Methodology · Statistics 2025-10-24 Bowen Gang, Hongmei Lin, Tiejun Tong

The Bag-and-Whisker Plot: A New Bagplot for Bivariate Data

The bagplot, also known as the "bag-and-bolster plot", is a notable extension of the boxplot from univariate to bivariate data. Although widely used, its practical application is hindered by two key limitations: the fixed inflation factor…

Methodology · Statistics 2025-12-09 Shenghao Qin, Bowen Gang, Tiejun Tong, Hengjian Cui

Outlier detection and a tail-adjusted boxplot based on extreme value theory

Whether an extreme observation is an outlier or not depends strongly on the corresponding tail behaviour of the underlying distribution. We develop an automatic, data-driven method to identify extreme tail behaviour that deviates from the…

Methodology · Statistics 2019-12-06 Shrijita Bhattacharya, Jan Beirlant

ggskewboxplots: Enhanced Boxplots for Skewed Data in R

Traditional boxplots are widely used for summarizing and visualizing the distribution of numerical data, yet they exhibit significant limitations when applied to skewed or heavy-tailed distributions, often leading to misclassification of…

Methodology · Statistics 2025-11-24 Mustafa Cavus

High-dimensional outlier detection using random projections

There exist multiple methods to detect outliers in multivariate data in the literature, but most of them require to estimate the covariance matrix. The higher the dimension, the more complex the estimation of the matrix becoming impossible…

Methodology · Statistics 2020-12-01 P. Navarro-Esteban, J. A. Cuesta-Albertos

RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution Samples

In recent years, there have been significant improvements in various forms of image outlier detection. However, outlier detection performance under adversarial settings lags far behind that in standard settings. This is due to the lack of…

Computer Vision and Pattern Recognition · Computer Science 2025-01-29 Hossein Mirzaei, Mohammad Jafari, Hamid Reza Dehbashi, Ali Ansari, Sepehr Ghobadi, Masoud Hadi, Arshia Soltani Moakhar, Mohammad Azizmalayeri, Mahdieh Soleymani Baghshah, Mohammad Hossein Rohban

Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls

Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data and has been extensively studied and used in a variety of practical tasks. However, most unsupervised…

Machine Learning · Computer Science 2025-01-07 Can Gao, Xiaofeng Tan, Jie Zhou, Weiping Ding, Witold Pedrycz

Detecting and Classifying Outliers in Big Functional Data

We propose two new outlier detection methods, for identifying and classifying different types of outliers in (big) functional data sets. The proposed methods are based on an existing method called Massive Unsupervised Outlier Detection…

Methodology · Statistics 2021-10-15 Oluwasegun Taiwo Ojo, Antonio Fernández Anta, Rosa E. Lillo, Carlo Sguera

REPPlab: An R package for detecting clusters and outliers using exploratory projection pursuit

The R-package REPPlab is designed to explore multivariate data sets using one-dimensional unsupervised projection pursuit. It is useful in practice as a preprocessing step to find clusters or as an outlier detection tool for multivariate…

Computation · Statistics 2024-09-10 Daniel Fischer, Alain Berro, Klaus Nordhausen, Anne Ruiz-Gazen

Boundary Peeling: Outlier Detection Method Using One-Class Peeling

Unsupervised outlier detection constitutes a crucial phase within data analysis and remains a dynamic realm of research. A good outlier detection algorithm should be computationally efficient, robust to tuning parameter selection, and…

Machine Learning · Statistics 2024-09-23 Sheikh Arafat, Na Sun, Maria L. Weese, Waldyn G. Martinez

Outlier Detection for Functional Data with R Package fdaoutlier

Outlier detection is one of the standard exploratory analysis tasks in functional data analysis. We present the R package fdaoutlier which contains implementations of some of the latest techniques for detecting functional outliers. The…

Computation · Statistics 2021-10-15 Oluwasegun Ojo, Rosa E. Lillo, Antonio Fernández Anta

RobPy: a Python Package for Robust Statistical Methods

Robust estimation provides essential tools for analyzing data that contain outliers, ensuring that statistical models remain reliable even in the presence of some anomalous data. While robust methods have long been available in R, users of…

Computation · Statistics 2024-11-05 Sarah Leyder, Jakob Raymaekers, Peter J. Rousseeuw, Thomas Servotte, Tim Verdonck

RCC-Dual-GAN: An Efficient Approach for Outlier Detection with Few Identified Anomalies

Outlier detection is an important task in data mining and many technologies have been explored in various applications. However, due to the default assumption that outliers are non-concentrated, unsupervised outlier detection may not…

Machine Learning · Computer Science 2020-03-10 Zhe Li, Chunhua Sun, Chunli Liu, Xiayu Chen, Meng Wang, Yezheng Liu

An experimental study of existing tools for outlier detection and cleaning in trajectories

Outlier detection and cleaning are essential steps in data preprocessing to ensure the integrity and validity of data analyses. This paper focuses on outlier points within individual trajectories, i.e., points that deviate significantly…

Databases · Computer Science 2025-11-26 Mariana M Garcez Duarte, Mahmoud Sakr

Detecting Outliers in High-dimensional Data with Mixed Variable Types using Conditional Gaussian Regression Models

Outlier detection has gained increasing interest in recent years, due to newly emerging technologies and the huge amount of high-dimensional data that are now available. Outlier detection can help practitioners to identify unwanted noise…

Statistics Theory · Mathematics 2021-05-20 Mads Lindskou, Torben Tvedebrink, Poul Svante Eriksen, Niels Morling

Identification of Outlying Observations with Quantile Regression for Censored Data

Outlying observations, which significantly deviate from other measurements, may distort the conclusions of data analysis. Therefore, identifying outliers is one of the important problems that should be solved to obtain reliable results.…

Computation · Statistics 2014-05-01 Soo-Heang Eo, Seung-Mo Hong, HyungJun Cho

Outlier Detection with Cluster Catch Digraphs

This paper introduces a novel family of outlier detection algorithms based on Cluster Catch Digraphs (CCDs), specifically tailored to address the challenges of high dimensionality and varying cluster shapes, which deteriorate the…

Machine Learning · Statistics 2024-10-10 Rui Shi, Nedret Billor, Elvan Ceyhan

Anomaly Detection by Robust Statistics

Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for…

Machine Learning · Statistics 2021-01-13 Peter J. Rousseeuw, Mia Hubert

CoMadOut -- A Robust Outlier Detection Algorithm based on CoMAD

Unsupervised learning methods are well established in the area of anomaly detection and achieve state of the art performances on outlier datasets. Outliers play a significant role, since they bear the potential to distort the predictions of…

Machine Learning · Computer Science 2024-07-02 Andreas Lohrer, Daniyal Kazempour, Maximilian Hünemörder, Peer Kröger