Related papers: Probabilistic Random Forest: A machine learning al…

The Probabilistic Random Forest applied to the selection of quasar candidates in the QUBRICS Survey

The number of known, bright ($i<18$), high-redshift ($z>2.5$) QSOs in the Southern Hemisphere is considerably lower than the corresponding number in the Northern Hemisphere due to the lack of multi-wavelength surveys at $\delta<0$. Recent…

Instrumentation and Methods for Astrophysics · Physics · 2021-07-14 · Francesco Guarneri, Giorgio Calderone, Stefano Cristiani, Fabio Fontanot, Konstantina Boutsia, Guido Cupani, Andrea Grazian, Valentina D'Odorico

Heterogeneous Random Forest

Random forest (RF) stands out as a highly favored machine learning approach for classification problems. The effectiveness of RF hinges on two key factors: the accuracy of individual trees and the diversity among them. In this study, we…

Machine Learning · Computer Science · 2024-10-28 · Ye-eun Kim, Seoung Yun Kim, Hyunjoong Kim

Machine learning Applied to Star-Galaxy-QSO Classification and Stellar Effective Temperature Regression

In modern astrophysics, the machine learning has increasingly gained more popularity with its incredibly powerful ability to make predictions or calculated suggestions for large amounts of data. We describe an application of the supervised…

Astrophysics of Galaxies · Physics · 2018-12-26 · Yu Bai, JiFeng Liu, Song Wang, Fan Yang

Regression-Enhanced Random Forests

Random forest (RF) methodology is one of the most popular machine learning techniques for prediction problems. In this article, we discuss some cases where random forests may suffer and propose a novel generalized RF method, namely…

Machine Learning · Statistics · 2019-04-24 · Haozhe Zhang, Dan Nettleton, Zhengyuan Zhu

Nonparametric Feature Selection by Random Forests and Deep Neural Networks

Random forests are a widely used machine learning algorithm, but their computational efficiency is undermined when applied to large-scale datasets with numerous instances and useless features. Herein, we propose a nonparametric feature…

Machine Learning · Computer Science · 2022-01-19 · Xiaojun Mao, Liuhua Peng, Zhonglei Wang

Random Similarity Forests

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science · 2022-04-13 · Maciej Piernik, Dariusz Brzezinski, Pawel Zawadzki

Probabilistic Machine Learning for Noisy Labels in Earth Observation

Label noise poses a significant challenge in Earth Observation (EO), often degrading the performance and reliability of supervised Machine Learning (ML) models. Yet, given the critical nature of several EO applications, developing robust…

Machine Learning · Computer Science · 2025-04-07 · Spyros Kondylatos, Nikolaos Ioannis Bountos, Ioannis Prapas, Angelos Zavras, Gustau Camps-Valls, Ioannis Papoutsis

Random Forest Calibration

The Random Forest (RF) classifier is often claimed to be relatively well calibrated when compared with other machine learning methods. Moreover, the existing literature suggests that traditional calibration methods, such as isotonic…

Machine Learning · Computer Science · 2025-01-29 · Mohammad Hossein Shaker, Eyke Hüllermeier

Diversity Conscious Refined Random Forest

Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and…

Machine Learning · Computer Science · 2025-07-08 · Sijan Bhattarai, Saurav Bhandari, Girija Bhusal, Saroj Shakya, Tapendra Pandey

Hyperparameters and Tuning Strategies for Random Forest

The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables…

Machine Learning · Statistics · 2019-02-27 · Philipp Probst, Marvin Wright, Anne-Laure Boulesteix

Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering

Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks. Cleaning data manually is labour-intensive and time-consuming. Previous research mostly focuses on enhancing classification…

Machine Learning · Computer Science · 2021-12-20 · Chang Liu, Han Yu, Boyang Li, Zhiqi Shen, Zhanning Gao, Peiran Ren, Xuansong Xie, Lizhen Cui, Chunyan Miao

Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement

Random Forests (RF) are among the state-of-the-art in many machine learning applications. With the ongoing integration of ML models into everyday life, the deployment and continuous application of models becomes more and more an important…

Machine Learning · Computer Science · 2021-10-20 · Sebastian Buschjäger, Katharina Morik

Random Forest Missing Data Algorithms

Random forest (RF) missing data algorithms are an attractive approach for dealing with missing data. They have the desirable properties of being able to handle mixed types of missing data, they are adaptive to interactions and nonlinearity,…

Machine Learning · Statistics · 2017-01-23 · Fei Tang, Hemant Ishwaran

On Machine-Learned Classification of Variable Stars with Sparse and Noisy Time-Series Data

With the coming data deluge from synoptic surveys, there is a growing need for frameworks that can quickly and automatically produce calibrated classification probabilities for newly-observed variables based on a small number of time-series…

Instrumentation and Methods for Astrophysics · Physics · 2015-03-17 · Joseph W. Richards, Dan L. Starr, Nathaniel R. Butler, Joshua S. Bloom, John M. Brewer, Arien Crellin-Quick, Justin Higgins, Rachel Kennedy, Maxime Rischard

Random Forest Variable Importance-based Selection Algorithm in Class Imbalance Problem

Random Forest is a machine learning method that offers many advantages, including the ability to easily measure variable importance. Class balancing technique is a well-known solution to deal with class imbalance problem. However, it has…

Machine Learning · Statistics · 2023-12-19 · Yunbi Nam, Sunwoo Han

Boosting SISSO Performance on Small Sample Datasets by Using Random Forests Prescreening for Complex Feature Selection

In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression is a key to extracting material descriptors from large datasets, in particular…

Machine Learning · Computer Science · 2024-10-01 · Xiaolin Jiang, Guanqi Liu, Jiaying Xie, Zhenpeng Hu

Approximate False Positive Rate Control in Selection Frequency for Random Forest

Random Forest has become one of the most popular tools for feature selection. Its ability to deal with high-dimensional data makes this algorithm especially useful for studies in neuroimaging and bioinformatics. Despite its popularity and…

Machine Learning · Computer Science · 2014-10-13 · Ender Konukoglu, Melanie Ganz

An Approximation Method for Fitted Random Forests

Random Forests (RF) is a popular machine learning method for classification and regression problems. It involves a bagging application to decision tree models. One of the primary advantages of the Random Forests model is the reduction in…

Machine Learning · Statistics · 2022-07-06 · Sai K Popuri

Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success

Predicting rare outcomes such as startup success is central to venture capital, demanding models that are both accurate and interpretable. We introduce Random Rule Forest (RRF), a lightweight ensemble method that uses a large language model…

Artificial Intelligence · Computer Science · 2025-09-17 · Ben Griffin, Diego Vidaurre, Ugur Koyluoglu, Joseph Ternasky, Fuat Alican, Yigit Ihlamur

Towards Robust Classification with Deep Generative Forests

Decision Trees and Random Forests are among the most widely used machine learning models, and often achieve state-of-the-art performance in tabular, domain-agnostic datasets. Nonetheless, being primarily discriminative models they lack…

Machine Learning · Statistics · 2020-07-14 · Alvaro H. C. Correia, Robert Peharz, Cassio de Campos