Related papers: Gamma distribution-based sampling for imbalanced d…

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Imbalance in the proportion of training samples belonging to different classes often poses performance degradation of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority…

Machine Learning · Computer Science 2021-03-30 Ayush Tripathi, Rupayan Chakraborty, Sunil Kumar Kopparapu

Minority Class Oversampling for Tabular Data with Deep Generative Models

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A…

Machine Learning · Computer Science 2020-07-21 Ramiro Camino, Christian Hammerschmidt, Radu State

Imbalanced data preprocessing techniques utilizing local data characteristics

Data imbalance, that is the disproportion between the number of training observations coming from different classes, remains one of the most significant challenges affecting contemporary machine learning. The negative impact of data…

Machine Learning · Computer Science 2021-11-30 Michał Koziarski

Oversampling for Imbalanced Learning Based on K-Means and SMOTE

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to…

Machine Learning · Computer Science 2020-03-06 Felix Last, Georgios Douzas, Fernando Bacao

A Novel Hybrid Sampling Framework for Imbalanced Learning

Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a…

Machine Learning · Computer Science 2022-08-23 Asif Newaz, Farhan Shahriyar Haq

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution, in case of imbalanced data, this condition usually does not hold true. To overcome this problem, we propose a novel data level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh, Abhinav Dhall

GenSample: A Genetic Algorithm for Oversampling in Imbalanced Datasets

Imbalanced datasets are ubiquitous. Classification performance on imbalanced datasets is generally poor for the minority class as the classifier cannot learn decision boundaries well. However, in sensitive applications like fraud detection,…

Machine Learning · Computer Science 2019-10-25 Vishwa Karia, Wenhao Zhang, Arash Naeim, Ramin Ramezani

A Synthetic Over-sampling method with Minority and Majority classes for imbalance problems

Class imbalance is a substantial challenge in classifying many real-world cases. Synthetic over-sampling methods have been effective to improve the performance of classifiers for imbalance problems. However, most synthetic over-sampling…

Machine Learning · Computer Science 2021-08-11 Hadi A. Khorshidi, Uwe Aickelin

Data Balancing Strategies: A Systematic Survey of Resampling and Augmentation Methods

Imbalanced datasets, where one class significantly outnumbers others, remain a persistent challenge in machine learning, often biasing predictions toward the majority class and degrading classifier performance. This paper provides a…

Machine Learning · Statistics 2026-04-30 Behnam Yousefimehr, Mehdi Ghatee, Javad Fazli, Shervin Ghaffari, Zahra Rafei, Mohammad Amin Seifi, Sajed Tavakoli, Abolfazl Nikahd, Mahdi Razi Gandomani, Alireza Orouji, Ramtin Mahmoudi Kashani, Sarina Heshmati, Negin Sadat Mousavi

A Quantum Approach to Synthetic Minority Oversampling Technique (SMOTE)

The paper proposes the Quantum-SMOTE method, a novel solution that uses quantum computing techniques to solve the prevalent problem of class imbalance in machine learning datasets. Quantum-SMOTE, inspired by the Synthetic Minority…

Quantum Physics · Physics 2025-03-31 Nishikanta Mohanty, Bikash K. Behera, Christopher Ferrie, Pravat Dash

Handling Imbalanced Data: A Case Study for Binary Class Problems

For several years till date, the major issues in terms of solving for classification problems are the issues of Imbalanced data. Because majority of the machine learning algorithms by default assumes all data are balanced, the algorithms do…

Machine Learning · Statistics 2020-10-12 Richmond Addo Danquah

Spam filtering on forums: A synthetic oversampling based approach for imbalanced data classification

Forums play an important role in providing a platform for community interaction. The introduction of irrelevant content or spam by individuals for commercial and social gains tends to degrade the professional experience presented to the…

Information Retrieval · Computer Science 2019-09-12 Pratik Ratadiya, Rahul Moorthy

Simplicial SMOTE: Oversampling Solution to the Imbalanced Learning Problem

SMOTE (Synthetic Minority Oversampling Technique) is the established geometric approach to random oversampling to balance classes in the imbalanced learning problem, followed by many extensions. Its idea is to introduce synthetic data…

Machine Learning · Computer Science 2025-03-06 Oleg Kachan, Andrey Savchenko, Gleb Gusev

SMOTE: Synthetic Minority Over-sampling Technique

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed…

Artificial Intelligence · Computer Science 2011-11-25 N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer

Kernel-Based Enhanced Oversampling Method for Imbalanced Classification

This paper introduces a novel oversampling technique designed to improve classification performance on imbalanced datasets. The proposed method enhances the traditional SMOTE algorithm by incorporating convex combination and kernel-based…

Machine Learning · Computer Science 2025-04-15 Wenjie Li, Sibo Zhu, Zhijian Li, Hanlin Wang

Efficient Hybrid Oversampling and Intelligent Undersampling for Imbalanced Big Data Classification

Imbalanced classification is a well-known challenge faced by many real-world applications. This issue occurs when the distribution of the target variable is skewed, leading to a prediction bias toward the majority class. With the arrival of…

Machine Learning · Computer Science 2023-10-10 Carla Vairetti, José Luis Assadi, Sebastián Maldonado

A Novel Resampling Technique for Imbalanced Dataset Optimization

Despite the enormous amount of data, particular events of interest can still be quite rare. Classification of rare events is a common problem in many domains, such as fraudulent transactions, malware traffic analysis and network intrusion…

Machine Learning · Computer Science 2021-01-01 Ivan Letteri, Antonio Di Cecco, Abeer Dyoub, Giuseppe Della Penna

Classification Imbalance as Transfer Learning

Classification imbalance arises when one class is much rarer than the other. We frame this setting as transfer learning under label (prior) shift between an imbalanced source distribution induced by the observed data and a balanced target…

Machine Learning · Statistics 2026-01-16 Eric Xia, Jason M. Klusowski

Kernel density estimation based sampling for imbalanced class distribution

Imbalanced response variable distribution is a common occurrence in data science. In fields such as fraud detection, medical diagnostics, system intrusion detection and many others where abnormal behavior is rarely observed the data under…

Machine Learning · Computer Science 2019-11-21 Firuz Kamalov

STEM Rebalance: A Novel Approach for Tackling Imbalanced Datasets using SMOTE, Edited Nearest Neighbour, and Mixup

Imbalanced datasets in medical imaging are characterized by skewed class proportions and scarcity of abnormal cases. When trained using such data, models tend to assign higher probabilities to normal cases, leading to biased performance.…

Machine Learning · Computer Science 2023-11-14 Yumnah Hasan, Fatemeh Amerehi, Patrick Healy, Conor Ryan