Related papers: Potential Anchoring for imbalanced data classifica…

Radial-Based Undersampling for Imbalanced Data Classification

Data imbalance remains one of the most widespread problems affecting contemporary machine learning. The negative effect data imbalance can have on the traditional learning algorithms is most severe in combination with other dataset…

Machine Learning · Computer Science 2021-04-20 Michał Koziarski

A Bilevel Optimization Framework for Imbalanced Data Classification

Data rebalancing techniques, including oversampling and undersampling, are a common approach to addressing the challenges of imbalanced data. To tackle unresolved problems related to both oversampling and undersampling, we propose a new…

Machine Learning · Computer Science 2025-07-11 Karen Medlin, Sven Leyffer, Krishnan Raghavan

Restoring balance: principled under/oversampling of data for optimal classification

Class imbalance in real-world data poses a common bottleneck for machine learning tasks, since achieving good generalization on under-represented examples is often challenging. Mitigation strategies, such as under or oversampling the data…

Disordered Systems and Neural Networks · Physics 2025-02-03 Emanuele Loffredo, Mauro Pastore, Simona Cocco, Rémi Monasson

Deep Learning Meets Oversampling: A Learning Framework to Handle Imbalanced Classification

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue,…

Machine Learning · Computer Science 2025-02-12 Sukumar Kishanthan, Asela Hevapathige

Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties

In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation,…

Machine Learning · Computer Science 2022-01-21 Mohamed S. Kraiem, Fernando Sánchez-Hernández, María N. Moreno-García

Learning Confidence Bounds for Classification with Imbalanced Data

Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques have been commonly employed to address…

Machine Learning · Computer Science 2025-10-22 Matt Clifford, Jonathan Erskine, Alexander Hepburn, Raúl Santos-Rodríguez, Dario Garcia-Garcia

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution; in the case of imbalanced data, this condition usually does not hold. To overcome this problem, we propose a novel data-level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh, Abhinav Dhall

An empirical evaluation of imbalanced data strategies from a practitioner's point of view

This paper evaluates six strategies for mitigating imbalanced data: oversampling, undersampling, ensemble methods, specialized algorithms, class weight adjustments, and a no-mitigation approach referred to as the baseline. These strategies…

Machine Learning · Computer Science 2023-11-13 Jacques Wainer

Learning Classifiers for Imbalanced and Overlapping Data

This study is about inducing classifiers using data that is imbalanced, with a minority class being under-represented in relation to the majority classes. The first section of this research focuses on the main characteristics of data that…

Machine Learning · Computer Science 2022-10-25 Shivaditya Shivganesh, Nitin Narayanan N, Pranav Murali, Ajaykumar M

Neural Network Based Undersampling Techniques

The class imbalance problem is commonly faced while developing machine learning models for real-life issues. Due to this problem, the fitted model tends to be biased towards the majority class data, which leads to lower precision, recall, AUC,…

Machine Learning · Computer Science 2019-08-20 Md. Adnan Arefeen, Sumaiya Tabassum Nimi, M Sohel Rahman

Resampling strategies for imbalanced regression: a survey and empirical analysis

Imbalanced problems can arise in different real-world situations, and to address this, certain strategies in the form of resampling or balancing algorithms are proposed. This issue has largely been studied in the context of classification,…

Machine Learning · Computer Science 2025-07-17 Juscimara G. Avelino, George D. C. Cavalcanti, Rafael M. O. Cruz

Stop Oversampling for Class Imbalance Learning: A Critical Review

For the last two decades, oversampling has been employed to overcome the challenge of learning from imbalanced datasets. Many approaches to solving this challenge have been offered in the literature. Oversampling, on the other hand, is a…

Machine Learning · Computer Science 2022-06-09 Ahmad B. Hassanat, Ahmad S. Tarawneh, Ghada A. Altarawneh, Abdullah Almuhaimeed

RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification

Real-world classification domains, such as medicine, health and safety, and finance, often exhibit imbalanced class priors and have asynchronous misclassification costs. In such cases, the classification model must achieve a high recall…

Machine Learning · Computer Science 2021-05-11 Michał Koziarski, Colin Bellinger, Michał Woźniak

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz, Shahriar Hassan, Farhan Shahriyar Haq

Imbalanced Big Data Oversampling: Taxonomy, Algorithms, Software, Guidelines and Future Directions

Learning from imbalanced data is among the most challenging areas in contemporary machine learning. This becomes even more difficult when considered in the context of big data, which calls for dedicated architectures capable of high-performance…

Machine Learning · Computer Science 2022-11-16 William C. Sleeman, Bartosz Krawczyk

VOS: a Method for Variational Oversampling of Imbalanced Data

Class imbalanced datasets are common in real-world applications that range from credit card fraud detection to rare disease diagnostics. Several popular classification algorithms assume that classes are approximately balanced, and hence…

Machine Learning · Statistics 2018-09-10 Val Andrei Fajardo, David Findlay, Roshanak Houmanfar, Charu Jaiswal, Jiaxi Liang, Honglei Xie

Handling Imbalanced Data: A Case Study for Binary Class Problems

For several years, imbalanced data has been a major issue in solving classification problems. Because the majority of machine learning algorithms assume by default that all data are balanced, the algorithms do…

Machine Learning · Statistics 2020-10-12 Richmond Addo Danquah

Synthetic Oversampling of Multi-Label Data based on Local Label Distribution

Class-imbalance is an inherent characteristic of multi-label data which affects the prediction accuracy of most multi-label learning methods. One efficient strategy to deal with this problem is to employ resampling techniques before…

Machine Learning · Computer Science 2021-05-18 Bin Liu, Grigorios Tsoumakas

Imbalanced classification: a paradigm-based review

A common issue for classification in scientific research and industry is the existence of imbalanced classes. When sample sizes of different classes are imbalanced in training data, naively implementing a classification method often leads…

Methodology · Statistics 2021-07-02 Yang Feng, Min Zhou, Xin Tong

A Study imbalance handling by various data sampling methods in binary classification

The purpose of this research report is to present our learning curve and the exposure to the Machine Learning life cycle, with the use of a Kaggle binary classification data set and taking to explore various techniques from…

Machine Learning · Computer Science 2021-05-25 Mohamed Hamama