Related papers: Tree density estimation
Consider a density $f$ on $[0,1]$ that must be estimated from an i.i.d. sample $X_1,\ldots,X_n$ drawn from $f$. In this note, we study binary-tree-based histogram estimates that use recursive splitting of intervals. If the decision to split an…
We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to…
This paper presents a brand new nonparametric density estimation strategy named the best-scored random forest density estimation whose effectiveness is supported by both solid theoretical analysis and significant experimental performance.…
For a density $f$ on ${\mathbb R}^d$, a {\it high-density cluster} is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the {\it cluster tree} of…
Optimal transport provides a metric which quantifies the dissimilarity between probability measures. For measures supported in discrete metric spaces, finding the optimal transport distance has cubic time complexity in the size of the…
We study the arboricity A and the maximum number T of edge-disjoint spanning trees of the Erdős-Rényi random graph G(n,p). For all p(n) in [0,1], we show that, with high probability, T is precisely the minimum between delta and…
A subset of leaves of a rooted tree induces a new tree in a natural way. The density of a tree $D$ inside a larger tree $T$ is the proportion of such leaf-induced subtrees in $T$ that are isomorphic to $D$ among all those with the same…
Let $G$ be a connected graph in which almost all vertices have linear degrees and let $T$ be a uniform spanning tree of $G$. For any fixed rooted tree $F$ of height $r$ we compute the asymptotic density of vertices $v$ for which the…
The notion of tree entropy was introduced by the author as a normalized limit of the number of spanning trees in finite graphs, but is defined on random infinite rooted graphs. We give some new expressions for tree entropy; one uses…
Estimating a joint Highest Posterior Density credible set for a multivariate posterior density is challenging as dimension gets larger. Credible intervals for univariate marginals are usually presented for ease of computation and…
We answer three questions posed by Bubeck and Linial on the limit densities of subtrees in trees. We prove there exist positive $\varepsilon_1$ and $\varepsilon_2$ such that every tree that is neither a path nor a star has inducibility at…
We study density estimation in Kullback-Leibler divergence: given an i.i.d. sample from an unknown density $p^\star$, the goal is to construct an estimator $\widehat{p}$ such that $\mathrm{KL}(p^\star,\widehat{p})$ is small with high…
The single-level density-based approach has long been widely acknowledged to be a conceptually and mathematically convincing clustering method. In this paper, we propose an algorithm called "best-scored clustering forest" that can obtain the…
We study the convergence of the predictive surface of regression trees and forests. To support our analysis we introduce a notion of adaptive concentration for regression trees. This approach breaks tree training into a model selection…
Consider the nearest neighbor graph for the integer lattice Z^d in d dimensions. For a large finite piece of it, consider choosing a spanning tree for that piece uniformly among all possible subgraphs that are spanning trees. As the piece…
We study the problem of maximizing the number of full degree vertices in a spanning tree $T$ of a graph $G$; that is, the number of vertices whose degree in $T$ equals their degree in $G$. In cubic graphs, this problem is equivalent to…
We propose Partition Tree, a novel tree-based framework for conditional density estimation over general outcome spaces that supports both continuous and categorical variables within a unified formulation. Our approach models conditional…
The ratio of two densities provides a direct characterization of their differences. We consider the two-sample comparison problem by estimating this ratio given i.i.d. observations from two distributions. To this end, we propose additive…
Working with tree graphs is always easier than with loopy ones, and spanning trees are the closest tree-like structures to a given graph. We find a correspondence between the solutions of the random K-satisfiability problem and those of spanning…
The number of rooted spanning forests divided by the number of spanning rooted trees in a graph G with Kirchhoff matrix K is the spectral quantity tau(G)= det(1+K)/det(K) of G by the matrix tree and matrix forest theorems. We prove that…