Related papers: The Binary Space Partitioning-Tree Process
The Binary Space Partitioning~(BSP)-Tree process is proposed to produce flexible 2-D partition structures which are originally used as a Bayesian nonparametric prior for relational modelling. It can hardly be applied to other learning tasks…
Space partitioning methods such as random forests and the Mondrian process are powerful machine learning methods for multi-dimensional and relational data, and are based on recursively cutting a domain. The flexibility of these methods is…
The Binary Space Partitioning-Tree~(BSP-Tree) process was recently proposed as an efficient strategy for space partitioning tasks. Because it uses more than one dimension to partition the space, the BSP-Tree Process is more efficient and…
Segmentation-based image coding methods provide high compression ratios when compared with traditional image coding approaches like the transform and sub band coding for low bit-rate compression applications. In this paper, a…
Shape modelling (with methods that output shapes) is a new and important task in Bayesian nonparametrics and bioinformatics. In this work, we focus on Bayesian nonparametric methods for capturing shapes by partitioning a space using curves.…
This report is concerned with the Mondrian process and its applications in machine learning. The Mondrian process is a guillotine-partition-valued stochastic process that possesses an elegant self-consistency property. The first part of the…
The stable under iterated tessellation (STIT) process is a stochastic process that produces a recursive partition of space with cut directions drawn independently from a distribution over the sphere. The case of random axis-aligned cuts is…
Bayesian nonparametric space partition (BNSP) models provide a variety of strategies for partitioning a $D$-dimensional space into a set of blocks. In this way, the data points lie in the same block would share certain kinds of homogeneity.…
Stochastic partition models divide a multi-dimensional space into a number of rectangular regions, such that the data within each region exhibit certain types of homogeneity. Due to the nature of their partition strategy, existing partition…
Bayesian model-based spatial clustering methods are widely used for their flexibility in estimating latent clusters with an unknown number of clusters while accounting for spatial proximity. Many existing methods are designed for clustering…
This paper introduces the Partition Tree Weighting technique, an efficient meta-algorithm for piecewise stationary sources. The technique works by performing Bayesian model averaging over a large class of possible partitions of the data…
Markov Chain Monte Carlo (MCMC) is a well-established family of algorithms primarily used in Bayesian statistics to sample from a target distribution when direct sampling is challenging. Existing work on Bayesian decision trees uses MCMC.…
We consider probabilistic inference in general hybrid networks, which include continuous and discrete variables in an arbitrary topology. We reexamine the question of variable discretization in a hybrid network aiming at minimizing the…
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are…
Partitioning large machine learning models across distributed accelerator systems is a complex process, requiring a series of interdependent decisions that are further complicated by internal sharding ambiguities. Consequently, existing…
We derive the exact partition function for a discrete model of random trees embedded in a one-dimensional space. These trees have vertices labeled by integers representing their position in the target space, with the SOS constraint that…
The tree-structured stick-breaking process (TS-SBP) mixture model is a non-parametric Bayesian model that can represent tree-like hierarchical structures among the mixture components. For TS-SBP mixture models, only a Markov chain Monte…
This work studies the statistical implications of using features comprised of general linear combinations of covariates to partition the data in randomized decision tree and forest regression algorithms. Using random tessellation theory in…
Introduced by Breiman, Random Forests are widely used classification and regression algorithms. While being initially designed as batch algorithms, several variants have been proposed to handle online learning. One particular instance of…
Spectral clustering views the similarity matrix as a weighted graph, and partitions the data by minimizing a graph-cut loss. Since it minimizes the across-cluster similarity, there is no need to model the distribution within each cluster.…