Related papers: CUSBoost: Cluster-based Under-sampling with Boosti…

MEBoost: Mixing Estimators with Boosting for Imbalanced Data Classification

The class imbalance problem has been a challenging research problem in the fields of machine learning and data mining, as most real-life datasets are imbalanced. Several existing machine learning algorithms try to maximize the accuracy…

Machine Learning · Computer Science 2018-06-21 Farshid Rayhan , Sajid Ahmed , Asif Mahbub , Md. Rafsan Jani , Swakkhar Shatabda , Dewan Md. Farid , Chowdhury Mofizur Rahman

LIUBoost: Locality Informed Underboosting for Imbalanced Data Classification

The problem of class imbalance along with class-overlapping has become a major issue in the domain of supervised learning. Most supervised learning algorithms assume equal cardinality of the classes under consideration while optimizing the…

Machine Learning · Computer Science 2017-11-16 Sajid Ahmed , Farshid Rayhan , Asif Mahbub , Md. Rafsan Jani , Swakkhar Shatabda , Dewan Md. Farid , Chowdhury Mofizur Rahman

WOTBoost: Weighted Oversampling Technique in Boosting for imbalanced learning

Machine learning classifiers often stumble over imbalanced datasets where classes are not equally represented. This inherent bias towards the majority class may result in low accuracy in labeling the minority class. Imbalanced learning is…

Machine Learning · Computer Science 2019-11-14 Wenhao Zhang , Ramin Ramezani , Arash Naeim

Classification of Imbalanced Credit scoring data sets Based on Ensemble Method with the Weighted-Hybrid-Sampling

In the era of big data, the utilization of credit-scoring models to determine the credit risk of applicants accurately becomes a trend in the future. The conventional machine learning on credit scoring data sets tends to have poor…

Machine Learning · Statistics 2021-02-10 Xiaofan Liu , Zuoquan Zhang , Di Wang

Progressive Boosting for Class Imbalance

Pattern recognition applications often suffer from skewed data distributions between classes, which may vary during operations w.r.t. the design data. Two-class classification systems designed using skewed data tend to recognize the…

Machine Learning · Computer Science 2019-12-02 Roghayeh Soleymani , Eric Granger , Giorgio Fumera

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution; in the case of imbalanced data, this condition usually does not hold true. To overcome this problem, we propose a novel data level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh , Abhinav Dhall

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Imbalance in the proportion of training samples belonging to different classes often poses performance degradation of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority…

Machine Learning · Computer Science 2021-03-30 Ayush Tripathi , Rupayan Chakraborty , Sunil Kumar Kopparapu

MixBoost: Synthetic Oversampling with Boosted Mixup for Handling Extreme Imbalance

Training a classification model on a dataset where the instances of one class outnumber those of the other class is a challenging problem. Such imbalanced datasets are standard in real-world situations such as fraud detection, medical…

Machine Learning · Computer Science 2020-09-04 Anubha Kabra , Ayush Chopra , Nikaash Puri , Pinkesh Badjatiya , Sukriti Verma , Piyush Gupta , Balaji K

CSMOUTE: Combined Synthetic Oversampling and Undersampling Technique for Imbalanced Data Classification

In this paper we propose a novel data-level algorithm for handling data imbalance in the classification task, Synthetic Majority Undersampling Technique (SMUTE). SMUTE leverages the concept of interpolation of nearby instances, previously…

Machine Learning · Computer Science 2021-04-20 Michał Koziarski

The SAMME.C2 algorithm for severely imbalanced multi-class classification

Classification predictive modeling involves the accurate assignment of observations in a dataset to target classes or categories. There is an increasing growth of real-world classification problems with severely imbalanced class…

Machine Learning · Statistics 2022-01-03 Banghee So , Emiliano A. Valdez

AdaCC: Cumulative Cost-Sensitive Boosting for Imbalanced Classification

Class imbalance poses a major challenge for machine learning as most supervised learning models might exhibit bias towards the majority class and under-perform in the minority class. Cost-sensitive learning tackles this problem by treating…

Machine Learning · Computer Science 2022-09-20 Vasileios Iosifidis , Symeon Papadopoulos , Bodo Rosenhahn , Eirini Ntoutsi

Statistical Undersampling with Mutual Information and Support Points

Class imbalance and distributional differences in large datasets present significant challenges for classification tasks in machine learning, often leading to biased models and poor predictive performance for minority classes. This work…

Machine Learning · Statistics 2024-12-20 Alex Mak , Shubham Sahoo , Shivani Pandey , Yidan Yue , Linglong Kong

On multi-class learning through the minimization of the confusion matrix norm

In imbalanced multi-class classification problems, the misclassification rate as an error measure may not be a relevant choice. Several methods have been developed where the performance measure retained richer information than the mere…

Machine Learning · Computer Science 2013-11-05 Sokol Koço , Cécile Capponi

The Many Faces of Optimal Weak-to-Strong Learning

Boosting is an extremely successful idea, allowing one to combine multiple low accuracy classifiers into a much more accurate voting classifier. In this work, we present a new and surprisingly simple Boosting algorithm that obtains a…

Machine Learning · Computer Science 2024-09-02 Mikael Møller Høgsgaard , Kasper Green Larsen , Markus Engelund Mathiasen

A binary PSO based ensemble under-sampling model for rebalancing imbalanced training data

Ensemble technique and under-sampling technique are both effective tools used for imbalanced dataset classification problems. In this paper, a novel ensemble method combining the advantages of both ensemble learning for biasing classifiers…

Machine Learning · Computer Science 2025-02-05 Jinyan Li , Yaoyang Wu , Simon Fong , Antonio J. Tallón-Ballesteros , Xin-she Yang , Sabah Mohammed , Feng Wu

Minority Class Oversampling for Tabular Data with Deep Generative Models

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A…

Machine Learning · Computer Science 2020-07-21 Ramiro Camino , Christian Hammerschmidt , Radu State

Tree Boosting Methods for Balanced and Imbalanced Classification and their Robustness Over Time in Risk Assessment

Most real-world classification problems deal with imbalanced datasets, posing a challenge for Artificial Intelligence (AI), i.e., machine learning algorithms, because the minority class, which is of extreme interest, often proves difficult…

Machine Learning · Computer Science 2025-04-28 Gissel Velarde , Michael Weichert , Anuj Deshmunkh , Sanjay Deshmane , Anindya Sudhir , Khushboo Sharma , Vaibhav Joshi

Data Balancing Strategies: A Systematic Survey of Resampling and Augmentation Methods

Imbalanced datasets, where one class significantly outnumbers others, remain a persistent challenge in machine learning, often biasing predictions toward the majority class and degrading classifier performance. This paper provides a…

Machine Learning · Statistics 2026-04-30 Behnam Yousefimehr , Mehdi Ghatee , Javad Fazli , Shervin Ghaffari , Zahra Rafei , Mohammad Amin Seifi , Sajed Tavakoli , Abolfazl Nikahd , Mahdi Razi Gandomani , Alireza Orouji , Ramtin Mahmoudi Kashani , Sarina Heshmati , Negin Sadat Mousavi

A Novel Hybrid Sampling Framework for Imbalanced Learning

Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a…

Machine Learning · Computer Science 2022-08-23 Asif Newaz , Farhan Shahriyar Haq

CGMOS: Certainty Guided Minority OverSampling

Handling imbalanced datasets is a challenging problem that, if not treated correctly, results in reduced classification performance. Imbalanced datasets are commonly handled using minority oversampling, whereas the SMOTE algorithm is a…

Machine Learning · Computer Science 2016-07-25 Xi Zhang , Di Ma , Lin Gan , Shanshan Jiang , Gady Agam