Related papers: An Alternative Prior Process for Nonparametric Bay…
The Bayesian approach to inference stands out for naturally allowing borrowing information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for…
One of the most used priors in Bayesian clustering is the Dirichlet prior. It can be expressed as a Chinese Restaurant Process. This process allows nonparametric estimation of the number of clusters when partitioning datasets. Its key…
Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, recent results proved posterior inconsistency of the number of clusters when the true number of…
Discrete random probability measures and the exchangeable random partitions they induce are key tools for addressing a variety of estimation and prediction problems in Bayesian inference. Indeed, many popular nonparametric priors, such as…
In Bayesian nonparametrics there exists a rich variety of discrete priors, including the Dirichlet process and its generalizations, which are nowadays well-established tools. Despite the remarkable advances, few proposals are tailored for…
Dirichlet process mixtures are particularly sensitive to the value of the precision parameter controlling the behavior of the latent partition. Randomization of the precision through a prior distribution is a common solution, which leads to…
We consider Bayesian nonparametric density estimation using a Pitman-Yor or a normalized inverse-Gaussian process kernel mixture as the prior distribution for a density. The procedure is studied from a frequentist perspective. Using the…
Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often…
We consider the problem of estimating Shannon's entropy $H$ from discrete data, in cases where the number of possible symbols is unknown or even countably infinite. The Pitman-Yor process, a generalization of the Dirichlet process, provides a…
Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems. While the number of clusters induced by random partition models has been…
There is a very rich literature proposing Bayesian approaches for clustering starting with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable…
A family of random probabilities is defined and studied. This family contains the Dirichlet process as a special case, corresponding to an inner point in the appropriate parameter space. The extension makes it possible to have random means…
Bayesian clustering methods have the widely touted advantage of providing a probabilistic characterization of uncertainty in clustering through the posterior distribution. An amazing variety of priors and likelihoods have been proposed for…
We develop a Bayesian framework for tackling the supervised clustering problem, the generic problem encountered in tasks such as reference matching, coreference resolution, identity uncertainty and record linkage. Our clustering model is…
This paper introduces and studies a new class of nonparametric prior distributions. Random probability distribution functions are constructed via normalization of random measures driven by increasing additive processes. In particular, we…
Dirichlet process mixture models (DPMM) play a central role in Bayesian nonparametrics, with applications throughout statistics and machine learning. DPMMs are generally used in clustering problems where the number of clusters is not known…
Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent…
Exemplar-based clustering methods have been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. They are appealing because they offer computational benefits over latent-mean models and can…
It has always been a great challenge for clustering algorithms to automatically determine the number of clusters according to the distribution of the data. Several approaches have been proposed to address this issue, including the recent…
In this paper, we provide an explicit probability distribution for classification purposes. It is derived from the Bayesian nonparametric mixture of Dirichlet process model, but with suitable modifications which remove unsuitable aspects of…