Related papers: Improved Weighted Random Forest for Classification…

Binary Classification: Is Boosting stronger than Bagging?

Random Forests have been one of the most popular bagging methods in the past few decades, especially due to their success at handling tabular datasets. They have been extensively studied and compared to boosting models, like XGBoost, which…

Machine Learning · Computer Science 2024-10-28 Dimitris Bertsimas , Vasiliki Stoumpou

Evolutionary bagging for ensemble learning

Ensemble learning has gained success in machine learning with major advantages over other learning methods. Bagging is a prominent ensemble learning method that creates subgroups of data, known as bags, that are trained by individual…

Neural and Evolutionary Computing · Computer Science 2022-09-07 Giang Ngo , Rodney Beard , Rohitash Chandra

Optimal Weighted Random Forests

The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) to aggregate their predictions.…

Machine Learning · Statistics 2023-05-18 Xinyu Chen , Dalei Yu , Xinyu Zhang

Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem

Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest…

Econometrics · Economics 2020-12-22 Mochen Yang , Edward McFowland , Gordon Burtch , Gediminas Adomavicius

An Approximation Method for Fitted Random Forests

Random Forests (RF) is a popular machine learning method for classification and regression problems. It involves a bagging application to decision tree models. One of the primary advantages of the Random Forests model is the reduction in…

Machine Learning · Statistics 2022-07-06 Sai K Popuri

Lassoed Forests: Random Forests with Adaptive Lasso Post-selection

Random forests are a statistical learning technique that use bootstrap aggregation to average high-variance and low-bias trees. Improvements to random forests, such as applying Lasso regression to the tree predictions, have been proposed in…

Machine Learning · Statistics 2025-11-13 Jing Shang , James Bannon , Benjamin Haibe-Kains , Robert Tibshirani

A weighted random survival forest

A weighted random survival forest is presented in the paper. It can be regarded as a modification of the random forest improving its performance. The main idea underlying the proposed model is to replace the standard procedure of averaging…

Machine Learning · Statistics 2019-01-03 Lev V. Utkin , Andrei V. Konstantinov , Viacheslav S. Chukanov , Mikhail V. Kots , Mikhail A. Ryabinin , Anna A. Meldo

Cross-Cluster Weighted Forests

Adapting machine learning algorithms to better handle the presence of clusters or batch effects within training datasets is important across a wide variety of biological applications. This article considers the effect of ensembling Random…

Machine Learning · Statistics 2025-04-01 Maya Ramchandran , Rajarshi Mukherjee , Giovanni Parmigiani

The Random Forest Classifier in WEKA: Discussion and New Developments for Imbalanced Data

Data analysis and machine learning have become an integrative part of the modern scientific methodology, providing automated techniques to predict further information based on observations. One of these classification and regression…

Computer Vision and Pattern Recognition · Computer Science 2019-01-07 Mario Amrehn , Firas Mualla , Elli Angelopoulou , Stefan Steidl , Andreas Maier

Large Random Forests: Optimisation for Rapid Evaluation

Random Forests are one of the most popular classifiers in machine learning. The larger they are, the more precise is the outcome of their predictions. However, this comes at a cost: their running time for classification grows linearly with…

Machine Learning · Computer Science 2019-12-24 Frederik Gossen , Bernhard Steffen

Exploiting random projections and sparsity with random forests and gradient boosting methods -- Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system, from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested…

Machine Learning · Statistics 2019-05-20 Arnaud Joly

An Algorithmic Framework for Constructing Multiple Decision Trees by Evaluating Their Combination Performance Throughout the Construction Process

Predictions using a combination of decision trees are known to be effective in machine learning. Typical ideas for constructing a combination of decision trees for prediction are bagging and boosting. Bagging independently constructs…

Machine Learning · Computer Science 2024-02-12 Keito Tajima , Naoki Ichijo , Yuta Nakahara , Toshiyasu Matsushima

Regression-Enhanced Random Forests

Random forest (RF) methodology is one of the most popular machine learning techniques for prediction problems. In this article, we discuss some cases where random forests may suffer and propose a novel generalized RF method, namely…

Machine Learning · Statistics 2019-04-24 Haozhe Zhang , Dan Nettleton , Zhengyuan Zhu

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data,…

Machine Learning · Statistics 2022-10-13 Domagoj Ćevid , Loris Michel , Jeffrey Näf , Nicolai Meinshausen , Peter Bühlmann

Adaptive Forests For Classification

Random Forests (RF) and Extreme Gradient Boosting (XGBoost) are two of the most widely used and highly performing classification and regression models. They aggregate equally weighted CART trees, generated randomly in RF or sequentially in…

Machine Learning · Computer Science 2025-10-28 Dimitris Bertsimas , Yubing Cui

Oblique and rotation double random forest

Random Forest is an ensemble of decision trees based on the bagging and random subspace concepts. As suggested by Breiman, the strength of unstable learners and the diversity among them are the ensemble models' core strength. In this paper,…

Machine Learning · Computer Science 2022-08-11 M. A. Ganaie , M. Tanveer , P. N. Suganthan , V. Snasel

Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance

As the size, complexity, and availability of data continues to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications. Tools like…

Machine Learning · Statistics 2020-11-10 Lucas Mentch , Siyu Zhou

Optimizing Ensemble Weights and Hyperparameters of Machine Learning Models for Regression Problems

Aggregating multiple learners through an ensemble of models aim to make better predictions by capturing the underlying distribution of the data more accurately. Different ensembling methods, such as bagging, boosting, and stacking/blending,…

Machine Learning · Statistics 2020-11-03 Mohsen Shahhosseini , Guiping Hu , Hieu Pham

Crossbreeding in Random Forest

Ensemble learning methods are designed to benefit from multiple learning algorithms for better predictive performance. The tradeoff of this improved performance is slower speed and larger size of ensemble learning systems compared to single…

Machine Learning · Computer Science 2021-01-22 Abolfazl Nadi , Hadi Moradi , Khalil Taheri

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary classification problems. The terminology "best-scored" means to select the one with the best empirical performance out of a certain number of purely random tree candidates…

Machine Learning · Statistics 2019-05-28 Hanyuan Hang , Xiaoyu Liu , Ingo Steinwart