Related papers: Dealing with Class Imbalance using Thresholding

OTLP: Output Thresholding Using Mixed Integer Linear Programming

Output thresholding is the technique of searching for the best threshold to use during inference for any classifier that can produce probability estimates on training and testing datasets. It is particularly useful in high imbalance…

Machine Learning · Computer Science 2024-05-21 Baran Koseoglu, Luca Traverso, Mohammed Topiwalla, Egor Kraev, Zoltan Szopory

Review of Methods for Handling Class-Imbalanced in Classification Problems

Learning classifiers from skewed or imbalanced datasets can occasionally lead to serious classification issues. In some cases, one class contains the majority of examples while the other, which is frequently the more…

Machine Learning · Computer Science 2022-11-11 Satyendra Singh Rawat, Amit Kumar Mishra

Outlier Detection as Instance Selection Method for Feature Selection in Time Series Classification

In order to allow machine learning algorithms to extract knowledge from raw data, these data must first be cleaned, transformed, and put into machine-appropriate form. This often very time-consuming phase is referred to as preprocessing.…

Machine Learning · Computer Science 2021-11-19 David Cemernek

An "outside the box" solution for imbalanced data classification

A common problem with real-world data sets is class imbalance, which can significantly affect the classification abilities of classifiers. Numerous methods have been proposed to cope with this problem; however, even state-of-the-art…

Machine Learning · Computer Science 2019-11-19 Hubert Jegierski, Stanisław Saganowski

Entropic one-class classifiers

The one-class classification problem is a well-known research endeavor in pattern recognition. The problem is also known under different names, such as outlier and novelty/anomaly detection. The core of the problem consists in modeling and…

Computer Vision and Pattern Recognition · Computer Science 2015-03-31 Lorenzo Livi, Alireza Sadeghian, Witold Pedrycz

Rethinking Unsupervised Outlier Detection via Multiple Thresholding

In the realm of unsupervised image outlier detection, assigning outlier scores holds greater significance than its subsequent task: thresholding for predicting labels. This is because determining the optimal threshold on non-separable…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Zhonghang Liu, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin

Calibrating Black Box Classification Models through the Thresholding Method

In high-dimensional classification settings, we wish to seek a balance between high power and ensuring control over a desired loss function. In many settings, the points most likely to be misclassified are those that lie near the decision…

Machine Learning · Statistics 2017-06-06 Arun Srinivasan

New Hard-thresholding Rules based on Data Splitting in High-dimensional Imbalanced Classification

In binary classification, imbalance refers to situations in which one class is heavily under-represented. This issue is due to either a data collection process or because one class is indeed rare in a population. Imbalanced classification…

Methodology · Statistics 2022-01-07 Arezou Mojiri, Abbas Khalili, Ali Zeinal Hamadani

Deep Learning Meets Oversampling: A Learning Framework to Handle Imbalanced Classification

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue,…

Machine Learning · Computer Science 2025-02-12 Sukumar Kishanthan, Asela Hevapathige

On Model Evaluation under Non-constant Class Imbalance

Many real-world classification problems are significantly class-imbalanced to the detriment of the class of interest. The standard set of proper evaluation metrics is well-known, but the usual assumption is that the test dataset imbalance equals…

Machine Learning · Computer Science 2020-04-16 Jan Brabec, Tomáš Komárek, Vojtěch Franc, Lukáš Machlica

Cellwise Outliers

In statistics and machine learning, the traditional meaning of the terms 'outlier' and 'anomaly' is a case in the dataset that behaves differently from the bulk of the data. This raises suspicion that it may belong to a different…

Methodology · Statistics 2026-04-17 Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw

A robust approach to model-based classification based on trimming and constraints

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…

Applications · Statistics 2019-11-20 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Boundary Peeling: Outlier Detection Method Using One-Class Peeling

Unsupervised outlier detection constitutes a crucial phase within data analysis and remains a dynamic realm of research. A good outlier detection algorithm should be computationally efficient, robust to tuning parameter selection, and…

Machine Learning · Statistics 2024-09-23 Sheikh Arafat, Na Sun, Maria L. Weese, Waldyn G. Martinez

Rethinking Class Imbalance in Machine Learning

Imbalance learning is a subfield of machine learning that focuses on learning tasks in the presence of class imbalance. Nearly all existing studies refer to class imbalance as a proportion imbalance, where the proportion of training samples…

Machine Learning · Computer Science 2023-05-09 Ou Wu

An Overview and a Benchmark of Active Learning for Outlier Detection with One-Class Classifiers

Active learning methods increase classification quality by means of user feedback. An important subcategory is active learning for outlier detection with one-class classifiers. While various methods in this category exist, selecting one for…

Machine Learning · Computer Science 2019-05-15 Holger Trittenbach, Adrian Englhardt, Klemens Böhm

Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics

The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations have the same behavior, while a probably small proportion of these observations are…

Statistics Theory · Mathematics 2021-09-21 Myrto Limnios, Nathan Noiry, Stéphan Clémençon

MCODE: Multivariate Conditional Outlier Detection

Outlier detection aims to identify unusual data instances that deviate from expected patterns. Outlier detection is particularly challenging when outliers are context dependent and when they are defined by unusual combinations of…

Artificial Intelligence · Computer Science 2015-05-18 Charmgil Hong, Milos Hauskrecht

A method for outlier detection based on cluster analysis and visual expert criteria

Outlier detection is an important problem occurring in a wide range of areas. Outliers are the outcome of fraudulent behaviour, mechanical faults, human error, or simply natural deviations. Many data mining applications perform outlier…

Machine Learning · Computer Science 2025-10-28 Juan A. Lara, David Lizcano, Víctor Rampérez, Javier Soriano

Detection of outlying proportions

In this paper we introduce a new method for detecting outliers in a set of proportions. It is based on the construction of a suitable two-way contingency table and on the application of an algorithm for the detection of outlying cells in…

Methodology · Statistics 2016-08-04 Flavio Mignone, Fabio Rapallo

Toward Scalable and Unified Example-based Explanation and Outlier Detection

When neural networks are employed for high-stakes decision-making, it is desirable that they provide explanations for their prediction in order for us to understand the features that have contributed to the decision. At the same time, it is…

Machine Learning · Computer Science 2022-05-10 Penny Chong, Ngai-Man Cheung, Yuval Elovici, Alexander Binder