English
Related papers

Related papers: JoinBoost: Grow Trees Over Normalized Data Using O…

200 papers

Tree ensembles such as XGBoost are often preferred for discriminative tasks in mixed-type tabular data, due to their inductive biases, minimal hyperparameter tuning, and training efficiency. We argue that these qualities, when leveraged…

Machine Learning · Computer Science 2026-03-10 Jim Achterberg , Marcel Haas , Bram van Dijk , Marco Spruit

Most real-world classification problems deal with imbalanced datasets, posing a challenge for Artificial Intelligence (AI), i.e., machine learning algorithms, because the minority class, which is of extreme interest, often proves difficult…

Machine Learning · Computer Science 2025-04-28 Gissel Velarde , Michael Weichert , Anuj Deshmunkh , Sanjay Deshmane , Anindya Sudhir , Khushboo Sharma , Vaibhav Joshi

Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees…

Machine Learning · Computer Science 2026-05-12 Yi-Siang Wang , Kuan-Yu Chen , Yu-Chen Den , Darby Tien-Hao Chang

XGBoost, a scalable tree boosting algorithm, has proven effective for many prediction tasks of practical interest, especially using tabular datasets. Hyperparameter tuning can further improve the predictive performance, but unlike neural…

Machine Learning · Computer Science 2021-11-16 Sanyam Kapoor , Valerio Perrone

Despite the rise to dominance of deep learning in unstructured data domains, tree-based methods such as Random Forests (RF) and Gradient Boosted Decision Trees (GBDT) are still the workhorses for handling discriminative tasks on tabular…

Machine Learning · Computer Science 2025-04-21 João Bravo

Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results…

Machine Learning · Computer Science 2016-06-14 Tianqi Chen , Carlos Guestrin

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires that the memory size is sufficient to hold a 2-3 multiple of the training set…

Machine Learning · Computer Science 2019-10-29 Julaiti Alafate , Yoav Freund

Accurate demand forecasting is critical for brick-and-mortar retailers to optimize inventory management and minimize costs. This study evaluates statistical baselines, tree-based ensembles (XGBoost and LightGBM), and deep learning…

Machine Learning · Computer Science 2026-03-12 Luka Hobor , Mario Brcic , Lidija Polutnik , Ante Kapetanovic

Boosting is a method for learning a single accurate predictor by linearly combining a set of less accurate weak learners. Recently, structured learning has found many applications in computer vision. Inspired by structured support vector…

Machine Learning · Computer Science 2020-03-10 Chunhua Shen , Guosheng Lin , Anton van den Hengel

A key element in solving real-life data science problems is selecting the types of models to use. Tree ensemble models (such as XGBoost) are usually recommended for classification and regression problems with tabular data. However, several…

Machine Learning · Computer Science 2021-11-24 Ravid Shwartz-Ziv , Amitai Armon

Machine Learning focuses on the construction and study of systems that can learn from data. This is connected with the classification problem, which usually is what Machine Learning algorithms are designed to solve. When a machine learning…

Machine Learning · Statistics 2018-02-13 Kyongche Kang , Jack Michalak

Stochastic Gradient TreeBoost is often found in many winning solutions in public data science challenges. Unfortunately, the best performance requires extensive parameter tuning and can be prone to overfitting. We propose PaloBoost, a…

Machine Learning · Statistics 2018-07-24 Yubin Park , Joyce C. Ho

Ensemble learning of LLMs has emerged as a promising alternative to enhance performance, but existing approaches typically treat models as black boxes, combining the inputs or final outputs while overlooking the rich internal…

Boosting is a popular algorithm in supervised machine learning with wide applications in regression and classification problems. It combines weak learners, such as regression trees, to obtain accurate predictions. However, in the presence…

Computation · Statistics 2025-02-06 Zhu Wang

More often than not in benchmark supervised ML, tabular data is flat, i.e. consists of a single $m \times d$ (rows, columns) file, but cases abound in the real world where observations are described by a set of tables with structural…

Machine Learning · Computer Science 2024-02-26 Mathieu Guillame-Bert , Richard Nock

Federated Learning (FL) has lately gained traction as it addresses how machine learning models train on distributed datasets. FL was designed for parametric models, namely Deep Neural Networks (DNNs).Thus, it has shown promise on image and…

Machine Learning · Computer Science 2024-05-06 William Lindskog , Christian Prehofer

Traditional gradient boosting algorithms employ static tree structures with fixed splitting criteria that remain unchanged throughout training, limiting their ability to adapt to evolving gradient distributions and problem-specific…

Machine Learning · Computer Science 2025-11-18 Boris Kriuk

Random Forests have been one of the most popular bagging methods in the past few decades, especially due to their success at handling tabular datasets. They have been extensively studied and compared to boosting models, like XGBoost, which…

Machine Learning · Computer Science 2024-10-28 Dimitris Bertsimas , Vasiliki Stoumpou

Large language models (LLMs) perform remarkably well on tabular datasets in zero- and few-shot settings, since they can extract meaning from natural language column headers that describe features and labels. Similarly, TabPFN, a recent…

XGBoost is a scalable ensemble technique based on gradient boosting that has demonstrated to be a reliable and efficient machine learning challenge solver. This work proposes a practical analysis of how this novel technique works in terms…

Machine Learning · Computer Science 2023-05-05 Candice Bentéjac , Anna Csörgő , Gonzalo Martínez-Muñoz
‹ Prev 1 2 3 10 Next ›