Related papers: iBRF: Improved Balanced Random Forest Classifier

A Novel Hybrid Sampling Framework for Imbalanced Learning

Class imbalance is a frequently occurring scenario in classification tasks. Learning from imbalanced data poses a major challenge, which has instigated a lot of research in this area. Data preprocessing using sampling techniques is a…

Machine Learning · Computer Science 2022-08-23 Asif Newaz, Farhan Shahriyar Haq

Infinite random forests for imbalanced classification tasks

We study predictive probability inference in classification tasks using random forests under class imbalance. We focus on two simplified variants of Breiman's algorithm, namely subsampling Infinite Random Forests (IRFs) and under-sampling…

Statistics Theory · Mathematics 2025-05-23 Moria Mayala, Olivier Wintenberger, Charles Tillier, Clément Dombry

Random Forest Variable Importance-based Selection Algorithm in Class Imbalance Problem

Random Forest is a machine learning method that offers many advantages, including the ability to easily measure variable importance. Class balancing technique is a well-known solution to deal with class imbalance problem. However, it has…

Machine Learning · Statistics 2023-12-19 Yunbi Nam, Sunwoo Han

Crossbreeding in Random Forest

Ensemble learning methods are designed to benefit from multiple learning algorithms for better predictive performance. The tradeoff of this improved performance is slower speed and larger size of ensemble learning systems compared to single…

Machine Learning · Computer Science 2021-01-22 Abolfazl Nadi, Hadi Moradi, Khalil Taheri

hi-RF: Incremental Learning Random Forest for large-scale multi-class Data Classification

In recent years, dynamically growing data and incrementally growing number of classes pose new challenges to large-scale data classification research. Most traditional methods struggle to balance the precision and computational burden when…

Machine Learning · Computer Science 2016-11-01 Tingting Xie, Yuxing Peng, Changjian Wang

Improving fraud prediction with incremental data balancing technique for massive data streams

The performance of classification algorithms with a massive and highly imbalanced data stream depends upon efficient balancing strategy. Some techniques of balancing strategy have been applied in the past with Batch data to resolve the…

Machine Learning · Computer Science 2019-10-22 Rafiq Ahmed Mohammed, Kok-Wai Wong, Mohd Fairuz Shiratuddin, Xuequn Wang

Classification Imbalance as Transfer Learning

Classification imbalance arises when one class is much rarer than the other. We frame this setting as transfer learning under label (prior) shift between an imbalanced source distribution induced by the observed data and a balanced target…

Machine Learning · Statistics 2026-01-16 Eric Xia, Jason M. Klusowski

Improved Weighted Random Forest for Classification Problems

Several studies have shown that combining machine learning models in an appropriate way will introduce improvements in the individual predictions made by the base models. The key to make well-performing ensemble model is in the diversity of…

Machine Learning · Computer Science 2021-03-01 Mohsen Shahhosseini, Guiping Hu

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Imbalance in the proportion of training samples belonging to different classes often poses performance degradation of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority…

Machine Learning · Computer Science 2021-03-30 Ayush Tripathi, Rupayan Chakraborty, Sunil Kumar Kopparapu

Heterogeneous Random Forest

Random forest (RF) stands out as a highly favored machine learning approach for classification problems. The effectiveness of RF hinges on two key factors: the accuracy of individual trees and the diversity among them. In this study, we…

Machine Learning · Computer Science 2024-10-28 Ye-eun Kim, Seoung Yun Kim, Hyunjoong Kim

Balanced Random Survival Forests for Extremely Unbalanced, Right Censored Data

Accuracies of survival models for life expectancy prediction as well as critical-care applications are significantly compromised due to the sparsity of samples and extreme imbalance between the survival (usually, the majority) and mortality…

Machine Learning · Statistics 2018-04-16 Kahkashan Afrin, Gurudev Illangovan, Sanjay S. Srivatsa, Satish T. S. Bukkapatnam

Random Forest Calibration

The Random Forest (RF) classifier is often claimed to be relatively well calibrated when compared with other machine learning methods. Moreover, the existing literature suggests that traditional calibration methods, such as isotonic…

Machine Learning · Computer Science 2025-01-29 Mohammad Hossein Shaker, Eyke Hüllermeier

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by…

Machine Learning · Computer Science 2022-08-26 Asif Newaz, Shahriar Hassan, Farhan Shahriyar Haq

Automated Imbalanced Classification via Layered Learning

In this paper we address imbalanced binary classification (IBC) tasks. Applying resampling strategies to balance the class distribution of training instances is a common approach to tackle these problems. Many state-of-the-art methods find…

Machine Learning · Computer Science 2022-05-31 Vitor Cerqueira, Luis Torgo, Paula Branco, Colin Bellinger

Gamma distribution-based sampling for imbalanced data

Imbalanced class distribution is a common problem in a number of fields including medical diagnostics, fraud detection, and others. It causes bias in classification algorithms leading to poor performance on the minority class data. In this…

Machine Learning · Computer Science 2020-09-23 Firuz Kamalov, Dmitry Denisov

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling…

Machine Learning · Computer Science 2024-10-01 Mohamed Abdelhamid, Abhyuday Desai

A Bilevel Optimization Framework for Imbalanced Data Classification

Data rebalancing techniques, including oversampling and undersampling, are a common approach to addressing the challenges of imbalanced data. To tackle unresolved problems related to both oversampling and undersampling, we propose a new…

Machine Learning · Computer Science 2025-07-11 Karen Medlin, Sven Leyffer, Krishnan Raghavan

Harnessing Mixed Features for Imbalance Data Oversampling: Application to Bank Customers Scoring

This study investigates rare event detection on tabular data within binary classification. Standard techniques to handle class imbalance include SMOTE, which generates synthetic samples from the minority class. However, SMOTE is…

Machine Learning · Computer Science 2025-04-01 Abdoulaye Sakho, Emmanuel Malherbe, Carl-Erik Gauthier, Erwan Scornet

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

Tree Boosting Methods for Balanced and Imbalanced Classification and their Robustness Over Time in Risk Assessment

Most real-world classification problems deal with imbalanced datasets, posing a challenge for Artificial Intelligence (AI), i.e., machine learning algorithms, because the minority class, which is of extreme interest, often proves difficult…

Machine Learning · Computer Science 2025-04-28 Gissel Velarde, Michael Weichert, Anuj Deshmunkh, Sanjay Deshmane, Anindya Sudhir, Khushboo Sharma, Vaibhav Joshi