Related papers: The Kernel Pitman-Yor Process
It has always been a great challenge for clustering algorithms to automatically determine the number of clusters according to the distribution of datasets. Several approaches have been proposed to address this issue, including the recent…
We consider Bayesian nonparametric density estimation using a Pitman-Yor or a normalized inverse-Gaussian process kernel mixture as the prior distribution for a density. The procedure is studied from a frequentist perspective. Using the…
In this paper we propose and study a class of simple, nonparametric, yet interpretable measures of conditional dependence between two random variables $Y$ and $Z$ given a third variable $X$, all taking values in general topological spaces.…
We consider the problem of clustering a sample of probability distributions from a random distribution on $\mathbb R^p$. Our proposed partitioning method makes use of a symmetric, positive-definite kernel $k$ and its associated reproducing…
Kernel-based multivariate statistical process control (K-MSPC) extends classical monitoring to nonlinear industrial processes. Its performance depends critically on kernel parameters such as lengthscales and variance terms. In current…
The Pitman-Yor process is a random discrete probability distribution whose atoms can be used to model the relative abundance of species. The process is indexed by a type parameter $\sigma$, which controls the number of different…
For a long time, the Dirichlet process has been the gold standard discrete random measure in Bayesian nonparametrics. The Pitman-Yor process provides a simple and mathematically tractable generalization, allowing for a very flexible…
One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are…
This paper develops a general asymptotic theory for nonparametric kernel regression in the presence of cluster dependence. We examine nonparametric density estimation, Nadaraya-Watson kernel regression, and local linear estimation. Our…
In this paper we consider approximations to the popular Pitman-Yor process obtained by truncating the stick-breaking representation. The truncation is determined by a random stopping rule that achieves an almost sure control on the…
This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of…
Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input.…
Stick-breaking has a long history and is one of the most popular procedures for constructing random discrete distributions in Statistics and Machine Learning. In particular, due to their intuitive construction and computational tractability…
We present BKP, a user-friendly and extensible R package that implements the Beta Kernel Process (BKP) -- a fully nonparametric and computationally efficient framework for modeling spatially varying binomial probabilities. The BKP model…
Kernel density estimation is a widely used nonparametric approach to estimate an unknown distribution. Recent work in Bayesian predictive inference has considered stochastic processes formed by specifying the predictive distribution for the…
We introduce a nonparametric way to estimate the global probability density function for a random persistence diagram. Precisely, a kernel density function centered at a given persistence diagram and a given bandwidth is constructed. Our…
The hierarchical Pitman-Yor process is a discrete random measure used as a prior in Bayesian nonparametrics. It is motivated by the study of groups of clustered data exhibiting power law behavior. Our focus in this paper is on the Gaussian…
We introduce the Pitman Yor Diffusion Tree (PYDT) for hierarchical clustering, a generalization of the Dirichlet Diffusion Tree (Neal, 2001) which removes the restriction to binary branching structure. The generative process is described…
This paper introduces Kernel-based Information Criterion (KIC) for model selection in regression analysis. The novel kernel-based complexity measure in KIC efficiently computes the interdependency between parameters of the model using a…
The Pitman-Yor process is a random probability distribution that can be used as a prior distribution in a nonparametric Bayesian analysis. The process is of species sampling type and generates discrete distributions, which yield of the…