Related papers: Learning Polytrees
We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first…
Chow and Liu (1968) studied the problem of learning a maximumlikelihood Markov tree. We generalize their work to more complexMarkov networks by considering the problem of learning a maximumlikelihood Markov network of bounded complexity. We…
Polytrees are a subclass of Bayesian networks that seek to capture the conditional dependencies between a set of $n$ variables as a directed forest and are motivated by their more efficient inference and improved interpretability. Since the…
We consider the problem of learning a tree-structured Ising model from data, such that subsequent predictions computed using the model are accurate. Concretely, we aim to learn a model such that posteriors $P(X_i|X_S)$ for small sets of…
We consider the problem of learning underlying tree structure from noisy, mixed data obtained from a linear model. To achieve this, we use the expectation maximization algorithm combined with Chow-Liu minimum spanning tree algorithm. This…
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both…
We show that $n$-variable tree-structured Ising models can be learned computationally-efficiently to within total variation distance $\epsilon$ from an optimal $O(n \ln n/\epsilon^2)$ samples, where $O(\cdot)$ hides an absolute constant…
I consider the problem of learning an optimal path graphical model from data and show the problem to be NP-hard for the maximum likelihood and minimum description length approaches and a Bayesian approach. This hardness result holds despite…
We provide high probability finite sample complexity guarantees for hidden non-parametric structure learning of tree-shaped graphical models, whose hidden and observable nodes are discrete random variables with either finite or countable…
The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the…
We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans.~Inform.~Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution $P$ on $\Sigma^n$ and a tree $T$ on $n$ nodes, we…
We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a…
This paper considers structure learning from data with $n$ samples of $p$ variables, assuming that the structure is a forest, using the Chow-Liu algorithm. Specifically, for incomplete data, we construct two model selection algorithms that…
Inferring probabilistic networks from data is a notoriously difficult task. Under various goodness-of-fit measures, finding an optimal network is NP-hard, even if restricted to polytrees of bounded in-degree. Polynomial-time algorithms are…
Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and nonconvex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction…
Maximum likelihood is one of the most widely used techniques to infer evolutionary histories. Although it is thought to be intractable, a proof of its hardness has been lacking. Here, we give a short proof that computing the maximum…
A Bayesian network is a directed acyclic graph that represents statistical dependencies between variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data.…
We propose a reinforcement-learning algorithm to tackle the challenge of reconstructing phylogenetic trees. The search for the tree that best describes the data is algorithmically challenging, thus all current algorithms for phylogeny…
This paper advances the theoretical understanding of active learning label complexity for decision trees as binary classifiers. We make two main contributions. First, we provide the first analysis of the disagreement coefficient for…
The structure of an evolving network contains information about its past. Extracting this information efficiently, however, is, in general, a difficult challenge. We formulate a fast and efficient method to estimate the most likely history…