Related papers: Controlling the False Split Rate in Tree-Based Agg…

Consensus Tree Estimation with False Discovery Rate Control via Partially Ordered Sets

Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and…

Methodology · Statistics 2025-12-01 Maria Alejandra Valdez Cabrera, Amy D Willis, Armeen Taeb

Tree-aggregated regression for compositional data with measurement errors

High-dimensional compositional covariates, often derived from count data, are subject to measurement error and are frequently analyzed after aggregation along a prespecified tree to improve interpretability in applications such as…

Methodology · Statistics 2026-05-18 Zhenghan Li, Tianying Wang

Linear Aggregation in Tree-based Estimators

Regression trees and their ensemble methods are popular methods for nonparametric regression: they combine strong predictive performance with interpretable estimators. To improve their utility for locally smooth response surfaces, we study…

Methodology · Statistics 2021-09-13 Sören R. Künzel, Theo F. Saarinen, Edward W. Liu, Jasjeet S. Sekhon

Measure Inducing Classification and Regression Trees for Functional Data

We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which allows leveraging representation learning and multiple splitting rules at the node level, reducing…

Machine Learning · Statistics 2020-11-03 Edoardo Belli, Simone Vantini

When does Subagging Work?

We study the effectiveness of subagging, or subsample aggregating, on regression trees, a popular non-parametric method in machine learning. First, we give sufficient conditions for pointwise consistency of trees. We formalize that (i) the…

Machine Learning · Statistics 2024-04-03 Christos Revelas, Otilia Boldea, Bas J. M. Werker

Tree-based methods for estimating heterogeneous model performance and model combining

Model performance is frequently reported only for the overall population under consideration. However, due to heterogeneity, overall performance measures often do not accurately represent model performance within specific subgroups. We…

Methodology · Statistics 2025-06-03 Ruotao Zhang, Constantine Gatsonis, Jon Steingrimsson

Trees-Based Models for Correlated Data

This paper presents a new approach for trees-based regression, such as simple regression tree, random forest and gradient boosting, in settings involving correlated data. We show the problems that arise when implementing standard…

Methodology · Statistics 2021-08-09 Assaf Rabinowicz, Saharon Rosset

Big Data Classification Using Augmented Decision Trees

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan, Sourish Das

Tree-Values: selective inference for regression trees

We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data…

Methodology · Statistics 2022-10-19 Anna C. Neufeld, Lucy L. Gao, Daniela M. Witten

Equal Splits of Vertex-Weighted Trees

Given a tree of weighted vertices, it is sometimes possible to break the tree into two equally-weighted subtrees within an allowable error. We give a fast algorithm that finds an edge which breaks the tree into equal-weight components or…

Combinatorics · Mathematics 2020-11-13 Corinne Mulvey

A Bottom-up Approach to Testing Hypotheses That Have a Branching Tree Dependence Structure, with False Discovery Rate Control

Modern statistical analyses often involve testing large numbers of hypotheses. In many situations, these hypotheses may have an underlying tree structure that not only helps determine the order that tests should be conducted but also…

Methodology · Statistics 2019-03-19 Yunxiao Li, Yi-Juan Hu, Glen A. Satten

Decision Stream: Cultivating Deep Decision Trees

Various modifications of decision trees have been extensively used during the past years due to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning, at…

Machine Learning · Computer Science 2017-09-05 Dmitry Ignatov, Andrey Ignatov

A Polynomial Algorithm for Balanced Clustering via Graph Partitioning

The objective of clustering is to discover natural groups in datasets and to identify geometrical structures which might reside there, without assuming any prior knowledge on the characteristics of the data. The problem can be seen as…

Computational Geometry · Computer Science 2018-01-26 Luis-Evaristo Caraballo, José-Miguel Díaz-Báñez, Nadine Kroher

Clustering multivariate functional data using unsupervised binary trees

We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random,…

Machine Learning · Statistics 2022-03-14 Steven Golovkine, Nicolas Klutchnikoff, Valentin Patilea

Distributed Inference in Tree Networks using Coding Theory

In this paper, we consider the problem of distributed inference in tree based networks. In the framework considered in this paper, distributed nodes make a 1-bit local decision regarding a phenomenon before sending it to the fusion center…

Information Theory · Computer Science 2016-11-17 Bhavya Kailkhura, Aditya Vempaty, Pramod K. Varshney

Random Planted Forest: a directly interpretable tree ensemble

We introduce a novel interpretable tree based algorithm for prediction in a regression setting. Our motivation is to estimate the unknown regression function from a functional decomposition perspective in which the functional components…

Machine Learning · Statistics 2023-08-04 Munir Hiabu, Enno Mammen, Joseph T. Meyer

Unifying Tree-Reweighted Belief Propagation and Mean Field for Tracking Extended Targets

This paper proposes a unified tree-reweighted belief propagation (BP) and mean field (MF) approach for scalable detection and tracking of extended targets within the framework of factor graph. The factor graph is partitioned into a BP…

Signal Processing · Electrical Eng. & Systems 2024-12-30 Weizhen Ma, Zhongliang Jing, Peng Dong, Henry Leung

Big Data Regression Using Tree Based Segmentation

Scaling regression to large datasets is a common problem in many application areas. We propose a two step approach to scaling regression to large datasets. Using a regression tree (CART) to segment the large dataset constitutes the first…

Machine Learning · Statistics 2017-07-26 Rajiv Sambasivan, Sourish Das

Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests

Phylogenetic trees are leaf-labelled trees used to model the evolution of species. In practice it is not uncommon to obtain two topologically distinct trees for the same set of species, and this motivates the use of distance measures to…

Data Structures and Algorithms · Computer Science 2026-03-24 David Mestel, Steven Chaplick, Steven Kelk, Ruben Meuwese

Dynamic Trees for Learning and Design

Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the…

Methodology · Statistics 2010-11-23 Matthew A. Taddy, Robert B. Gramacy, Nicholas G. Polson