English
Related papers

Related papers: Handling Imbalanced Data: A Case Study for Binary …

200 papers

Data imbalance, that is the disproportion between the number of training observations coming from different classes, remains one of the most significant challenges affecting contemporary machine learning. The negative impact of data…

Machine Learning · Computer Science 2021-11-30 Michał Koziarski

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling…

Machine Learning · Computer Science 2024-10-01 Mohamed Abdelhamid , Abhyuday Desai

Imbalance in the proportion of training samples belonging to different classes often poses performance degradation of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority…

Machine Learning · Computer Science 2021-03-30 Ayush Tripathi , Rupayan Chakraborty , Sunil Kumar Kopparapu

There are many real-world classification problems wherein the issue of data imbalance (the case when a data set contains substantially more samples for one/many classes than the rest) is unavoidable. While under-sampling the problematic…

Computer Vision and Pattern Recognition · Computer Science 2018-01-09 John McKay , Isaac Gerg , Vishal Monga

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz , Shahriar Hassan , Farhan Shahriyar Haq

In last few years there are major changes and evolution has been done on classification of data. As the application area of technology is increases the size of data also increases. Classification of data becomes difficult because of…

Machine Learning · Computer Science 2013-05-09 Rushi Longadge , Snehalata Dongre

For the last two decades, oversampling has been employed to overcome the challenge of learning from imbalanced datasets. Many approaches to solving this challenge have been offered in the literature. Oversampling, on the other hand, is a…

Machine Learning · Computer Science 2022-06-09 Ahmad B. Hassanat , Ahmad S. Tarawneh , Ghada A. Altarawneh , Abdullah Almuhaimeed

Traditionally, in supervised machine learning, (a significant) part of the available data (usually 50% to 80%) is used for training and the rest for validation. In many problems, however, the data is highly imbalanced in regard to different…

Machine Learning · Computer Science 2020-04-21 Xiaowei Gu , Plamen P Angelov , Eduardo Almeida Soares

Class imbalance is a substantial challenge in classifying many real-world cases. Synthetic over-sampling methods have been effective to improve the performance of classifiers for imbalance problems. However, most synthetic over-sampling…

Machine Learning · Computer Science 2021-08-11 Hadi A. Khorshidi , Uwe Aickelin

Imbalanced data, where the positive samples represent only a small proportion compared to the negative samples, makes it challenging for classification problems to balance the false positive and false negative rates. A common approach to…

Machine Learning · Statistics 2026-02-17 Pengfei Lyu , Zhengchi Ma , Linjun Zhang , Anru R. Zhang

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to…

Machine Learning · Computer Science 2020-03-06 Felix Last , Georgios Douzas , Fernando Bacao

Imbalanced classification and spurious correlation are common challenges in data science and machine learning. Both issues are linked to data imbalance, with certain groups of data samples significantly underrepresented, which in turn would…

Machine Learning · Statistics 2026-02-10 Ryumei Nakada , Yichen Xu , Lexin Li , Linjun Zhang

Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class. Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic…

Machine Learning · Computer Science 2022-08-29 Daochen Zha , Kwei-Herng Lai , Qiaoyu Tan , Sirui Ding , Na Zou , Xia Hu

Imbalanced Data (ID) is a problem that deters Machine Learning (ML) models for achieving satisfactory results. ID is the occurrence of a situation where the quantity of the samples belonging to one class outnumbers that of the other by a…

The purpose of this research report is to present the our learning curve and the exposure to the Machine Learning life cycle, with the use of a Kaggle binary classification data set and taking to explore various techniques from…

Machine Learning · Computer Science 2021-05-25 Mohamed Hamama

Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue,…

Machine Learning · Computer Science 2025-02-12 Sukumar Kishanthan , Asela Hevapathige

In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation,…

Machine Learning · Computer Science 2022-01-21 Mohamed S. Kraiem , Fernando Sánchez-Hernández , María N. Moreno-García

The problem of class imbalance is extensive for focusing on numerous applications in the real world. In such a situation, nearly all of the examples are labeled as one class called majority class, while far fewer examples are labeled as the…

This study examines the impact of class-imbalanced data on deep learning models and proposes a technique for data balancing by generating synthetic data for the minority class. Unlike random-based oversampling, our method prioritizes…

Machine Learning · Computer Science 2024-02-26 Hung Nguyen , Morris Chang

Synthetic oversampling of minority examples using SMOTE and its variants is a leading strategy for addressing imbalanced classification problems. Despite the success of this approach in practice, its theoretical foundations remain…

Machine Learning · Statistics 2025-10-24 Touqeer Ahmad , Mohammadreza M. Kalan , François Portier , Gilles Stupfler
‹ Prev 1 2 3 10 Next ›