Related papers: Learning Classifiers for Imbalanced and Overlappin…

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution, in case of imbalanced data, this condition usually does not hold true. To overcome this problem, we propose a novel data level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh , Abhinav Dhall

Survey of resampling techniques for improving classification performance in unbalanced datasets

A number of classification problems need to deal with data imbalance between classes. Often it is desired to have a high recall on the minority class while maintaining a high precision on the majority class. In this paper, we review a…

Applications · Statistics 2016-08-23 Ajinkya More

Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties

In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation,…

Machine Learning · Computer Science 2022-01-21 Mohamed S. Kraiem , Fernando Sánchez-Hernández , María N. Moreno-García

Imbalanced Classification via Explicit Gradient Learning From Augmented Data

Learning from imbalanced data is one of the most significant challenges in real-world classification tasks. In such cases, neural networks performance is substantially impaired due to preference towards the majority class. Existing…

Machine Learning · Computer Science 2022-11-13 Bronislav Yasinnik , Moshe Salhov , Ofir Lindenbaum , Amir Averbuch

Deep Learning Meets Oversampling: A Learning Framework to Handle Imbalanced Classification

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue,…

Machine Learning · Computer Science 2025-02-12 Sukumar Kishanthan , Asela Hevapathige

Stop Oversampling for Class Imbalance Learning: A Critical Review

For the last two decades, oversampling has been employed to overcome the challenge of learning from imbalanced datasets. Many approaches to solving this challenge have been offered in the literature. Oversampling, on the other hand, is a…

Machine Learning · Computer Science 2022-06-09 Ahmad B. Hassanat , Ahmad S. Tarawneh , Ghada A. Altarawneh , Abdullah Almuhaimeed

Review of Methods for Handling Class-Imbalanced in Classification Problems

Learning classifiers using skewed or imbalanced datasets can occasionally lead to classification issues; this is a serious issue. In some cases, one class contains the majority of examples while the other, which is frequently the more…

Machine Learning · Computer Science 2022-11-11 Satyendra Singh Rawat , Amit Kumar Mishra

Class Imbalance Problem in Data Mining Review

In last few years there are major changes and evolution has been done on classification of data. As the application area of technology is increases the size of data also increases. Classification of data becomes difficult because of…

Machine Learning · Computer Science 2013-05-09 Rushi Longadge , Snehalata Dongre

Restoring balance: principled under/oversampling of data for optimal classification

Class imbalance in real-world data poses a common bottleneck for machine learning tasks, since achieving good generalization on under-represented examples is often challenging. Mitigation strategies, such as under or oversampling the data…

Disordered Systems and Neural Networks · Physics 2025-02-03 Emanuele Loffredo , Mauro Pastore , Simona Cocco , Rémi Monasson

A Survey of Methods for Managing the Classification and Solution of Data Imbalance Problem

The problem of class imbalance is extensive for focusing on numerous applications in the real world. In such a situation, nearly all of the examples are labeled as one class called majority class, while far fewer examples are labeled as the…

Machine Learning · Computer Science 2020-12-23 Khan Md. Hasib , Md. Sadiq Iqbal , Faisal Muhammad Shah , Jubayer Al Mahmud , Mahmudul Hasan Popel , Md. Imran Hossain Showrov , Shakil Ahmed , Obaidur Rahman

Bridging the Gap: Simultaneous Fine Tuning for Data Re-Balancing

There are many real-world classification problems wherein the issue of data imbalance (the case when a data set contains substantially more samples for one/many classes than the rest) is unavoidable. While under-sampling the problematic…

Computer Vision and Pattern Recognition · Computer Science 2018-01-09 John McKay , Isaac Gerg , Vishal Monga

Minority Class Oversampling for Tabular Data with Deep Generative Models

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A…

Machine Learning · Computer Science 2020-07-21 Ramiro Camino , Christian Hammerschmidt , Radu State

Influence of Resampling on Accuracy of Imbalanced Classification

In many real-world binary classification tasks (e.g. detection of certain objects from images), an available dataset is imbalanced, i.e., it has much less representatives of a one class (a minor class), than of another. Generally, accurate…

Machine Learning · Statistics 2017-07-14 Evgeny Burnaev , Pavel Erofeev , Artem Papanov

Self-paced Ensemble for Highly Imbalanced Massive Data Classification

Many real-world applications reveal difficulties in learning classifiers from imbalanced data. The rising big data era has been witnessing more classification tasks with large-scale but extremely imbalance and low-quality datasets. Most of…

Machine Learning · Computer Science 2020-10-20 Zhining Liu , Wei Cao , Zhifeng Gao , Jiang Bian , Hechang Chen , Yi Chang , Tie-Yan Liu

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz , Shahriar Hassan , Farhan Shahriyar Haq

Handling Imbalanced Data: A Case Study for Binary Class Problems

For several years till date, the major issues in terms of solving for classification problems are the issues of Imbalanced data. Because majority of the machine learning algorithms by default assumes all data are balanced, the algorithms do…

Machine Learning · Statistics 2020-10-12 Richmond Addo Danquah

A Bilevel Optimization Framework for Imbalanced Data Classification

Data rebalancing techniques, including oversampling and undersampling, are a common approach to addressing the challenges of imbalanced data. To tackle unresolved problems related to both oversampling and undersampling, we propose a new…

Machine Learning · Computer Science 2025-07-11 Karen Medlin , Sven Leyffer , Krishnan Raghavan

Learning Confidence Bounds for Classification with Imbalanced Data

Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques have been commonly employed to address…

Machine Learning · Computer Science 2025-10-22 Matt Clifford , Jonathan Erskine , Alexander Hepburn , Raúl Santos-Rodríguez , Dario Garcia-Garcia

Fair Oversampling Technique using Heterogeneous Clusters

Class imbalance and group (e.g., race, gender, and age) imbalance are acknowledged as two reasons in data that hinder the trade-off between fairness and utility of machine learning classifiers. Existing techniques have jointly addressed…

Machine Learning · Computer Science 2023-05-24 Ryosuke Sonoda

Synthetic Oversampling of Multi-Label Data based on Local Label Distribution

Class-imbalance is an inherent characteristic of multi-label data which affects the prediction accuracy of most multi-label learning methods. One efficient strategy to deal with this problem is to employ resampling techniques before…

Machine Learning · Computer Science 2021-05-18 Bin Liu , Grigorios Tsoumakas