Related papers: Generative Forests

NRGBoost: Energy-Based Generative Boosted Trees

Despite the rise to dominance of deep learning in unstructured data domains, tree-based methods such as Random Forests (RF) and Gradient Boosted Decision Trees (GBDT) are still the workhorses for handling discriminative tasks on tabular…

Machine Learning · Computer Science 2025-04-21 João Bravo

Generative Trees: Adversarial and Copycat

While Generative Adversarial Networks (GANs) achieve spectacular results on unstructured data like images, there is still a gap on tabular data, data for which state of the art supervised learning still favours to a large extent decision…

Machine Learning · Computer Science 2022-02-14 Richard Nock, Mathieu Guillame-Bert

Generative modeling of density regression through tree flows

A common objective in the analysis of tabular data is estimating the conditional distribution (in contrast to only producing predictions) of a set of "outcome" variables given a set of "covariates", which is sometimes referred to as the…

Machine Learning · Statistics 2024-10-08 Zhuoqun Wang, Naoki Awaya, Li Ma

BUFF: Boosted Decision Tree based Ultra-Fast Flow matching

Tabular data stands out as one of the most frequently encountered types in high energy physics. Unlike commonly homogeneous data such as pixelated images, simulating high-dimensional tabular data and accurately capturing their correlations…

Instrumentation and Detectors · Physics 2024-04-30 Cheng Jiang, Sitian Qian, Huilin Qu

Towards Robust Classification with Deep Generative Forests

Decision Trees and Random Forests are among the most widely used machine learning models, and often achieve state-of-the-art performance in tabular, domain-agnostic datasets. Nonetheless, being primarily discriminative models they lack…

Machine Learning · Statistics 2020-07-14 Alvaro H. C. Correia, Robert Peharz, Cassio de Campos

Adversarial random forests for density estimation and generative modeling

We propose methods for density estimation and data synthesis using a novel form of unsupervised random forests. Inspired by generative adversarial networks, we implement a recursive procedure in which trees gradually learn structural…

Machine Learning · Statistics 2023-03-14 David S. Watson, Kristin Blesch, Jan Kapar, Marvin N. Wright

A Survey on Data-Centric AI: Tabular Learning from Reinforcement Learning and Generative AI Perspective

Tabular data is one of the most widely used data formats across various domains such as bioinformatics, healthcare, and marketing. As artificial intelligence moves towards a data-centric perspective, improving data quality is essential for…

Machine Learning · Computer Science 2025-02-18 Wangyang Ying, Cong Wei, Nanxu Gong, Xinyuan Wang, Haoyue Bai, Arun Vignesh Malarkkan, Sixun Dong, Dongjie Wang, Denghui Zhang, Yanjie Fu

Exploiting random projections and sparsity with random forests and gradient boosting methods -- Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system, from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested…

Machine Learning · Statistics 2019-05-20 Arnaud Joly

Statistical Inference via Generative Models: Flow Matching and Causal Inference

Generative AI has achieved remarkable empirical success, but from the perspective of statistics it often remains opaque: its predictions may be accurate, yet the underlying mechanism is difficult to interpret, analyze, and trust. This book…

Machine Learning · Statistics 2026-03-11 Shinto Eguchi

Reinforced Decision Trees

In order to speed-up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, this structure being then used as a way to speed-up the prediction…

Machine Learning · Computer Science 2015-11-26 Aurélia Léon, Ludovic Denoyer

Learning Decision Trees as Amortized Structure Inference

Building predictive models for tabular data presents fundamental challenges, notably in scaling consistently, i.e., more resources translating to better performance, and generalizing systematically beyond the training data distribution.…

Machine Learning · Computer Science 2025-03-11 Mohammed Mahfoud, Ghait Boukachab, Michał Koziarski, Alex Hernandez-Garcia, Stefan Bauer, Yoshua Bengio, Nikolay Malkin

Deep differentiable forest with sparse attention for the tabular data

We present a general architecture of deep differentiable forest and its sparse attention mechanism. The differentiable forest has the advantages of both trees and neural networks. Its structure is a simple binary tree, easy to use and…

Machine Learning · Computer Science 2020-03-03 Yingshi Chen

Neural Random Forest Imitation

We present Neural Random Forest Imitation - a novel approach for transforming random forests into neural networks. Existing methods propose a direct mapping and produce very inefficient architectures. In this work, we introduce an imitation…

Machine Learning · Computer Science 2024-04-05 Christoph Reinders, Bodo Rosenhahn

Boosting gets full Attention for Relational Learning

More often than not in benchmark supervised ML, tabular data is flat, i.e. consists of a single $m \times d$ (rows, columns) file, but cases abound in the real world where observations are described by a set of tables with structural…

Machine Learning · Computer Science 2024-02-26 Mathieu Guillame-Bert, Richard Nock

Generative Adversarial Forests for Better Conditioned Adversarial Learning

In recent times, many of the breakthroughs in various vision-related tasks have revolved around improving learning of deep models; these methods have ranged from network architectural improvements such as Residual Networks, to various forms…

Machine Learning · Statistics 2018-05-15 Yan Zuo, Gil Avraham, Tom Drummond

A Closer Look at Deep Learning Methods on Tabular Datasets

Tabular data is prevalent across diverse domains in machine learning. With the rapid progress of deep tabular prediction methods, especially pretrained (foundation) models, there is a growing need to evaluate these methods systematically…

Machine Learning · Computer Science 2025-11-10 Han-Jia Ye, Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, De-Chuan Zhan

Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees

Tabular data is hard to acquire and is subject to missing values. This paper introduces a novel approach for generating and imputing mixed-type (continuous and categorical) tabular data utilizing score-based diffusion and conditional flow…

Machine Learning · Computer Science 2024-02-21 Alexia Jolicoeur-Martineau, Kilian Fatras, Tal Kachman

Towards Data-Centric AI: A Comprehensive Survey of Traditional, Reinforcement, and Generative Approaches for Tabular Data Transformation

Tabular data is one of the most widely used formats across industries, driving critical applications in areas such as finance, healthcare, and marketing. In the era of data-centric AI, improving data quality and representation has become…

Machine Learning · Computer Science 2025-01-22 Dongjie Wang, Yanyong Huang, Wangyang Ying, Haoyue Bai, Nanxu Gong, Xinyuan Wang, Sixun Dong, Tao Zhe, Kunpeng Liu, Meng Xiao, Pengfei Wang, Pengyang Wang, Hui Xiong, Yanjie Fu

Dynamic Trees for Learning and Design

Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the…

Methodology · Statistics 2010-11-23 Matthew A. Taddy, Robert B. Gramacy, Nicholas G. Polson

Reinforcement Learning for Generative AI: A Survey

Deep Generative AI has been a long-standing essential topic in the machine learning community, which can impact a number of application areas like text generation and computer vision. The major paradigm to train a generative model is…

Machine Learning · Computer Science 2025-02-25 Yuanjiang Cao, Quan Z. Sheng, Julian McAuley, Lina Yao