Related papers: Random Forest Calibration

Regression-Enhanced Random Forests

Random forest (RF) methodology is one of the most popular machine learning techniques for prediction problems. In this article, we discuss some cases where random forests may suffer and propose a novel generalized RF method, namely…

Machine Learning · Statistics 2019-04-24 Haozhe Zhang, Dan Nettleton, Zhengyuan Zhu

An Approximation Method for Fitted Random Forests

Random Forests (RF) is a popular machine learning method for classification and regression problems. It involves a bagging application to decision tree models. One of the primary advantages of the Random Forests model is the reduction in…

Machine Learning · Statistics 2022-07-06 Sai K Popuri

Optimal Weighted Random Forests

The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) to aggregate their predictions.…

Machine Learning · Statistics 2023-05-18 Xinyu Chen, Dalei Yu, Xinyu Zhang

Random Forest Variable Importance-based Selection Algorithm in Class Imbalance Problem

Random Forest is a machine learning method that offers many advantages, including the ability to easily measure variable importance. Class balancing technique is a well-known solution to deal with class imbalance problem. However, it has…

Machine Learning · Statistics 2023-12-19 Yunbi Nam, Sunwoo Han

Heterogeneous Random Forest

Random forest (RF) stands out as a highly favored machine learning approach for classification problems. The effectiveness of RF hinges on two key factors: the accuracy of individual trees and the diversity among them. In this study, we…

Machine Learning · Computer Science 2024-10-28 Ye-eun Kim, Seoung Yun Kim, Hyunjoong Kim

Random Forest Missing Data Algorithms

Random forest (RF) missing data algorithms are an attractive approach for dealing with missing data. They have the desirable properties of being able to handle mixed types of missing data, they are adaptive to interactions and nonlinearity,…

Machine Learning · Statistics 2017-01-23 Fei Tang, Hemant Ishwaran

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science 2025-07-08 Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

Better Classifier Calibration for Small Data Sets

Classifier calibration does not always go hand in hand with the classifier's ability to separate the classes. There are applications where good classifier calibration, i.e. the ability to produce accurate probability estimates, is more…

Machine Learning · Computer Science 2020-05-26 Tuomo Alasalmi, Jaakko Suutala, Heli Koskimäki, Juha Röning

Probability Calibration Trees

Obtaining accurate and well calibrated probability estimates from classifiers is useful in many applications, for example, when minimising the expected cost of classifications. Existing methods of calibrating probability estimates are…

Machine Learning · Computer Science 2018-09-17 Tim Leathart, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer

Targeting predictors in random forest regression

Random forest regression (RF) is an extremely popular tool for the analysis of high-dimensional data. Nonetheless, its benefits may be lessened in sparse settings due to weak predictors, and a pre-estimation dimension reduction (targeting)…

Econometrics · Economics 2020-11-09 Daniel Borup, Bent Jesper Christensen, Nicolaj Nørgaard Mühlbach, Mikkel Slot Nielsen

Crossbreeding in Random Forest

Ensemble learning methods are designed to benefit from multiple learning algorithms for better predictive performance. The tradeoff of this improved performance is slower speed and larger size of ensemble learning systems compared to single…

Machine Learning · Computer Science 2021-01-22 Abolfazl Nadi, Hadi Moradi, Khalil Taheri

Confidence and Uncertainty Assessment for Distributional Random Forests

The Distributional Random Forest (DRF) is a recently introduced Random Forest algorithm to estimate multivariate conditional distributions. Due to its general estimation procedure, it can be employed to estimate a wide range of targets such…

Statistics Theory · Mathematics 2023-12-20 Jeffrey Näf, Corinne Emmenegger, Peter Bühlmann, Nicolai Meinshausen

hi-RF: Incremental Learning Random Forest for large-scale multi-class Data Classification

In recent years, dynamically growing data and incrementally growing number of classes pose new challenges to large-scale data classification research. Most traditional methods struggle to balance the precision and computational burden when…

Machine Learning · Computer Science 2016-11-01 Tingting Xie, Yuxing Peng, Changjian Wang

A Large-Scale Study of Probabilistic Calibration in Neural Network Regression

Accurate probabilistic predictions are essential for optimal decision making. While neural network miscalibration has been studied primarily in classification, we investigate this in the less-explored domain of regression. We conduct the…

Machine Learning · Computer Science 2023-06-08 Victor Dheur, Souhaib Ben Taieb

One Class Splitting Criteria for Random Forests

Random Forests (RFs) are strong machine learning tools for classification and regression. However, they remain supervised algorithms, and no extension of RFs to the one-class setting has been proposed, except for techniques based on…

Machine Learning · Statistics 2016-11-22 Nicolas Goix, Nicolas Drougard, Romain Brault, Maël Chiapino

Probabilistic Scores of Classifiers, Calibration is not Enough

In binary classification tasks, accurate representation of probabilistic predictions is essential for various real-world applications such as predicting payment defaults or assessing medical risks. The model must then be well-calibrated to…

Machine Learning · Computer Science 2024-08-08 Agathe Fernandes Machado, Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic, François Hu

On Extreme Pruning of Random Forest Ensembles for Real-time Predictive Applications

Random Forest (RF) is an ensemble supervised machine learning technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe…

Machine Learning · Computer Science 2015-03-18 Khaled Fawagreh, Mohamad Medhat Gaber, Eyad Elyan

Hyperparameters and Tuning Strategies for Random Forest

The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables…

Machine Learning · Statistics 2019-02-27 Philipp Probst, Marvin Wright, Anne-Laure Boulesteix

An Outlier Detection-based Tree Selection Approach to Extreme Pruning of Random Forests

Random Forest (RF) is an ensemble classification technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there…

Machine Learning · Computer Science 2015-03-19 Khaled Fawagreh, Mohamad Medhat Gaber, Eyad Elyan

Reassessing How to Compare and Improve the Calibration of Machine Learning Models

A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. This property has become increasingly important as the impact of machine…

Machine Learning · Computer Science 2025-02-25 Muthu Chidambaram, Rong Ge