Related papers: Improved Sampling Techniques for Learning an Imbal…

A Novel Hybrid Sampling Framework for Imbalanced Learning

Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a…

Machine Learning · Computer Science 2022-08-23 Asif Newaz, Farhan Shahriyar Haq

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Imbalance in the proportion of training samples belonging to different classes often poses performance degradation of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority…

Machine Learning · Computer Science 2021-03-30 Ayush Tripathi, Rupayan Chakraborty, Sunil Kumar Kopparapu

Oversampling for Imbalanced Learning Based on K-Means and SMOTE

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to…

Machine Learning · Computer Science 2020-03-06 Felix Last, Georgios Douzas, Fernando Bacao

SMOTE: Synthetic Minority Over-sampling Technique

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed…

Artificial Intelligence · Computer Science 2011-11-25 N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer

Data Balancing Strategies: A Systematic Survey of Resampling and Augmentation Methods

Imbalanced datasets, where one class significantly outnumbers others, remain a persistent challenge in machine learning, often biasing predictions toward the majority class and degrading classifier performance. This paper provides a…

Machine Learning · Statistics 2026-04-30 Behnam Yousefimehr, Mehdi Ghatee, Javad Fazli, Shervin Ghaffari, Zahra Rafei, Mohammad Amin Seifi, Sajed Tavakoli, Abolfazl Nikahd, Mahdi Razi Gandomani, Alireza Orouji, Ramtin Mahmoudi Kashani, Sarina Heshmati, Negin Sadat Mousavi

Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO

Class imbalance in a dataset is one of the major challenges that can significantly impact the performance of machine learning models resulting in biased predictions. Numerous techniques have been proposed to address class imbalanced…

Machine Learning · Computer Science 2022-10-25 Md Manjurul Ahsan, Md Shahin Ali, Zahed Siddique

Predicting class-imbalanced business risk using resampling, regularization, and model ensembling algorithms

We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating…

Machine Learning · Statistics 2019-03-14 Yan Wang, Xuelei Sherry Ni

CSMOUTE: Combined Synthetic Oversampling and Undersampling Technique for Imbalanced Data Classification

In this paper we propose a novel data-level algorithm for handling data imbalance in the classification task, Synthetic Majority Undersampling Technique (SMUTE). SMUTE leverages the concept of interpolation of nearby instances, previously…

Machine Learning · Computer Science 2021-04-20 Michał Koziarski

An empirical evaluation of imbalanced data strategies from a practitioner's point of view

This paper evaluates six strategies for mitigating imbalanced data: oversampling, undersampling, ensemble methods, specialized algorithms, class weight adjustments, and a no-mitigation approach referred to as the baseline. These strategies…

Machine Learning · Computer Science 2023-11-13 Jacques Wainer

Benchmark of Data Preprocessing Methods for Imbalanced Classification

Severe class imbalance is one of the main conditions that make machine learning in cybersecurity difficult. A variety of dataset preprocessing methods have been introduced over the years. These methods modify the training dataset by…

Machine Learning · Computer Science 2023-03-07 Radovan Haluška, Jan Brabec, Tomáš Komárek

LoRAS: An oversampling approach for imbalanced datasets

The Synthetic Minority Oversampling TEchnique (SMOTE) is widely-used for the analysis of imbalanced datasets. It is known that SMOTE frequently over-generalizes the minority class, leading to misclassifications for the majority class, and…

Machine Learning · Computer Science 2020-08-18 Saptarshi Bej, Narek Davtyan, Markus Wolfien, Mariam Nassar, Olaf Wolkenhauer

G-SMOTE: A GMM-based synthetic minority oversampling technique for imbalanced learning

Imbalanced Learning is an important learning algorithm for the classification models, which have enjoyed much popularity on many applications. Typically, imbalanced learning algorithms can be partitioned into two types, i.e., data level…

Machine Learning · Computer Science 2018-10-25 Tianlun Zhang, Xi Yang

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz, Shahriar Hassan, Farhan Shahriyar Haq

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling…

Machine Learning · Computer Science 2024-10-01 Mohamed Abdelhamid, Abhyuday Desai

Kernel-Based Enhanced Oversampling Method for Imbalanced Classification

This paper introduces a novel oversampling technique designed to improve classification performance on imbalanced datasets. The proposed method enhances the traditional SMOTE algorithm by incorporating convex combination and kernel-based…

Machine Learning · Computer Science 2025-04-15 Wenjie Li, Sibo Zhu, Zhijian Li, Hanlin Wang

Minority Class Oversampling for Tabular Data with Deep Generative Models

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A…

Machine Learning · Computer Science 2020-07-21 Ramiro Camino, Christian Hammerschmidt, Radu State

WOTBoost: Weighted Oversampling Technique in Boosting for imbalanced learning

Machine learning classifiers often stumble over imbalanced datasets where classes are not equally represented. This inherent bias towards the majority class may result in low accuracy in labeling minority class. Imbalanced learning is…

Machine Learning · Computer Science 2019-11-14 Wenhao Zhang, Ramin Ramezani, Arash Naeim

Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning

Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class. Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic…

Machine Learning · Computer Science 2022-08-29 Daochen Zha, Kwei-Herng Lai, Qiaoyu Tan, Sirui Ding, Na Zou, Xia Hu

CGMOS: Certainty Guided Minority OverSampling

Handling imbalanced datasets is a challenging problem that if not treated correctly results in reduced classification performance. Imbalanced datasets are commonly handled using minority oversampling, whereas the SMOTE algorithm is a…

Machine Learning · Computer Science 2016-07-25 Xi Zhang, Di Ma, Lin Gan, Shanshan Jiang, Gady Agam

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution, in case of imbalanced data, this condition usually does not hold true. To overcome this problem, we propose a novel data level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh, Abhinav Dhall