Related papers: Tree based credible set estimation
Credible intervals and credible sets, such as highest posterior density (HPD) intervals, form an integral statistical tool in Bayesian phylogenetics, both for phylogenetic analyses and for development. Readily available for continuous…
In circular plot sampling, trees within a given distance from the sample plot location constitute a sample, which is used to infer characteristics of interest for the forest area. If the sample is collected using a technical device located…
We consider statistical inference in the density estimation model using a tree-based Bayesian approach, with Optional Pólya trees as prior distribution. We derive near-optimal convergence rates for corresponding posterior distributions…
Consider a density $f$ on $[0,1]$ that must be estimated from an i.i.d. sample $X_1,...,X_n$ drawn from $f$. In this note, we study binary-tree-based histogram estimates that use recursive splitting of intervals. If the decision to split an…
In order to develop reliable services using machine learning, it is important to understand the uncertainty of the model outputs. Often the probability distribution that the prediction target follows has a complex shape, and a mixture…
We present sparse tree-based and list-based density estimation methods for binary/categorical data. Our density estimation models are higher-dimensional analogues of variable-bin-width histograms. In each leaf of the tree (or list), the…
Posterior distributions for community structure in sparse planted bi-section models are shown to achieve exact (resp. almost-exact) recovery, with sharp bounds for the sparsity regimes where edge probabilities decrease as $O(\log(n)/n)$…
Kernel density estimation is a popular method for estimating unseen probability distributions. However, the convergence of these classical estimators to the true density slows down in high dimensions. Moreover, they do not define meaningful…
The recursive and hierarchical structure of full rooted trees is applicable to represent statistical models in various areas, such as data compression, image processing, and machine learning. In most of these cases, the full rooted tree is…
Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and…
A cluster tree provides a highly-interpretable summary of a density function by representing the hierarchy of its high-density clusters. It is estimated using the empirical tree, which is the cluster tree constructed from a density…
Frequentist coverage of $(1-\alpha)$-highest posterior density (HPD) credible sets is studied in a signal plus noise model under a large class of noise distributions. We consider a specific class of spike-and-slab prior distributions.…
In Bayesian statistics, the highest posterior density (HPD) interval is often used to describe properties of a posterior distribution. As a method for estimating confidence intervals (CIs), the HPD has two main desirable properties.…
Previously, we proposed a probabilistic data generation model represented by an unobservable tree and a sequential updating method to calculate a posterior distribution over a set of trees. The set is called a meta-tree. In this paper, we…
Given i.i.d. samples from some unknown continuous density on the hyper-rectangle $[0, 1]^d$, we attempt to learn a piecewise constant function that approximates this underlying density non-parametrically. Our density estimate is defined on a…
We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under…
We introduce a novel framework for uncertainty quantification in clustering that combines martingale posterior distributions with density-based clustering. Unlike classical model-based approaches, which define clusters at the latent level…
Because of their accuracy, methods based on ensembles of regression trees are a popular approach for making predictions. Some common examples include Bayesian additive regression trees, boosting and random forests. This paper focuses on…
For a density $f$ on $\mathbb{R}^d$, a high-density cluster is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the cluster tree of…
We study the rates of convergence of the posterior distribution for Bayesian density estimation with Dirichlet mixtures of normal distributions as the prior. The true density is assumed to be twice continuously differentiable. The bandwidth…