Related papers: Forest Learning from Data and its Universal Coding

A Generalization of the Chow-Liu Algorithm and its Application to Statistical Learning

We extend the Chow-Liu algorithm to general random variables, whereas previous versions considered only the finite case. In particular, this paper applies the generalization to Suzuki's learning algorithm that generates from data forests…

Information Theory · Computer Science 2010-02-12 Joe Suzuki

Optimal Rates for Learning Hidden Tree Structures

We provide high probability finite sample complexity guarantees for hidden non-parametric structure learning of tree-shaped graphical models, whose hidden and observable nodes are discrete random variables with either finite or countable…

Machine Learning · Statistics 2021-04-01 Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate

Sample-Optimal and Efficient Learning of Tree Ising models

We show that $n$-variable tree-structured Ising models can be learned computationally-efficiently to within total variation distance $\epsilon$ from an optimal $O(n \ln n/\epsilon^2)$ samples, where $O(\cdot)$ hides an absolute constant…

Machine Learning · Computer Science 2020-12-01 Constantinos Daskalakis, Qinxuan Pan

Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models

We consider the problem of learning a tree-structured Ising model from data, such that subsequent predictions computed using the model are accurate. Concretely, we aim to learn a model such that posteriors $P(X_i|X_S)$ for small sets of…

Machine Learning · Computer Science 2021-11-25 Enric Boix-Adsera, Guy Bresler, Frederic Koehler

Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates

The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both…

Information Theory · Computer Science 2011-02-15 Vincent Y. F. Tan, Animashree Anandkumar, Alan S. Willsky

Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu

We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution $P$ on $\Sigma^n$ and a tree $T$ on $n$ nodes, we…

Data Structures and Algorithms · Computer Science 2021-07-23 Arnab Bhattacharyya, Sutanu Gayen, Eric Price, N. V. Vinodchandran

Optimal estimation of Gaussian (poly)trees

We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first…

Machine Learning · Computer Science 2024-02-12 Yuhao Wang, Ming Gao, Wai Ming Tai, Bryon Aragam, Arnab Bhattacharyya

Latent Tree Approximation in Linear Model

We consider the problem of learning underlying tree structure from noisy, mixed data obtained from a linear model. To achieve this, we use the expectation maximization algorithm combined with Chow-Liu minimum spanning tree algorithm. This…

Information Theory · Computer Science 2017-10-06 Navid Tafaghodi Khajavi

Fault Trees from Data: Efficient Learning with an Evolutionary Algorithm

Cyber-physical systems come with increasingly complex architectures and failure modes, which complicates the task of obtaining accurate system reliability models. At the same time, with the emergence of the (industrial) Internet-of-Things,…

Formal Languages and Automata Theory · Computer Science 2019-09-16 Alexis Linard, Doina Bucur, Marielle Stoelinga

Robust estimation of tree structured models

Consider the problem of learning undirected graphical models on trees from corrupted data. Recently Katiyar et al. showed that it is possible to recover trees from noisy binary data up to a small equivalence class of possible trees. Their…

Machine Learning · Statistics 2021-02-11 Marta Casanellas, Marina Garrote-López, Piotr Zwiernik

Data Selection: A General Principle for Building Small Interpretable Models

We present convincing empirical evidence for an effective and general strategy for building accurate small models. Such models are attractive for interpretability and also find use in resource-constrained environments. The strategy is to…

Machine Learning · Computer Science 2024-04-30 Abhishek Ghose

Uncharted Forest a Technique for Exploratory Data Analysis

Exploratory data analysis is crucial for developing and understanding classification models from high-dimensional datasets. We explore the utility of a new unsupervised tree ensemble called uncharted forest for visualizing class…

Machine Learning · Statistics 2018-07-03 Casey Kneale, Steven D. Brown

Learning Staged Trees from Incomplete Data

Staged trees are probabilistic graphical models capable of representing any class of non-symmetric independence via a coloring of its vertices. Several structural learning routines have been defined and implemented to learn staged trees…

Machine Learning · Statistics 2024-05-29 Jack Storror Carter, Manuele Leonelli, Eva Riccomagno, Gherardo Varando

LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation

Missing data imputation is a critical challenge in various domains, such as healthcare and finance, where data completeness is vital for accurate analysis. Large language models (LLMs), trained on vast corpora, have shown strong potential…

Machine Learning · Computer Science 2025-08-26 Xinrui He, Yikun Ban, Jiaru Zou, Tianxin Wei, Curtiss B. Cook, Jingrui He

Learning Polytrees

We consider the task of learning the maximum-likelihood polytree from data. Our first result is a performance guarantee establishing that the optimal branching (or Chow-Liu tree), which can be computed very easily, constitutes a good…

Artificial Intelligence · Computer Science 2013-01-30 Sanjoy Dasgupta

Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees

The Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) has been a mainstay for the learning of tree-structured graphical models from i.i.d. sampled data vectors. Its theoretical properties have been well-studied and are well-understood.…

Machine Learning · Computer Science 2021-10-29 Fengzhuo Zhang, Anshoo Tandon, Vincent Y. F. Tan

Consistency of Random Forest Type Algorithms under a Probabilistic Impurity Decrease Condition

This paper derives a unifying theorem establishing consistency results for a broad class of tree-based algorithms. It improves current results in two aspects. First of all, it can be applied to algorithms that vary from traditional Random…

Statistics Theory · Mathematics 2024-02-22 Ricardo Blum, Munir Hiabu, Enno Mammen, Joseph T. Meyer

Learning Latent Tree Graphical Models

We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees…

Machine Learning · Statistics 2010-09-15 Myung Jin Choi, Vincent Y. F. Tan, Animashree Anandkumar, Alan S. Willsky

Classification and regression tree methods for incomplete data from sample surveys

Analysis of sample survey data often requires adjustments to account for missing data in the outcome variables of principal interest. Standard adjustment methods based on item imputation or on propensity weighting factors rely heavily on…

Methodology · Statistics 2016-03-08 Wei-Yin Loh, John Eltinge, MoonJung Cho, Yuanzhi Li

CO2 Forest: Improved Random Forest by Continuous Optimization of Oblique Splits

We propose a novel algorithm for optimizing multivariate linear threshold functions as split functions of decision trees to create improved Random Forest classifiers. Standard tree induction methods resort to sampling and exhaustive search…

Machine Learning · Computer Science 2015-06-26 Mohammad Norouzi, Maxwell D. Collins, David J. Fleet, Pushmeet Kohli