Related papers: The Kernel Pitman-Yor Process

Non-parametric Power-law Data Clustering

It has always been a great challenge for clustering algorithms to automatically determine the cluster numbers according to the distribution of datasets. Several approaches have been proposed to address this issue, including the recent…

Machine Learning · Computer Science 2013-06-14 Xuhui Fan, Yiling Zeng, Longbing Cao

Adaptive Bayesian density estimation using Pitman-Yor or normalized inverse-Gaussian process kernel mixtures

We consider Bayesian nonparametric density estimation using a Pitman-Yor or a normalized inverse-Gaussian process kernel mixture as the prior distribution for a density. The procedure is studied from a frequentist perspective. Using the…

Statistics Theory · Mathematics 2013-02-15 Catia Scricciolo

Kernel Partial Correlation Coefficient -- a Measure of Conditional Dependence

In this paper we propose and study a class of simple, nonparametric, yet interpretable measures of conditional dependence between two random variables $Y$ and $Z$ given a third variable $X$, all taking values in general topological spaces.…

Methodology · Statistics 2022-09-20 Zhen Huang, Nabarun Deb, Bodhisattva Sen

Kernel K-means clustering of distributional data

We consider the problem of clustering a sample of probability distributions from a random distribution on $\mathbb R^p$. Our proposed partitioning method makes use of a symmetric, positive-definite kernel $k$ and its associated reproducing…

Machine Learning · Statistics 2025-09-23 Amparo Baíllo, Jose R. Berrendero, Martín Sánchez-Signorini

Probabilistic multivariate statistical process control via kernel parameter uncertainty propagation

Kernel-based multivariate statistical process control (K-MSPC) extends classical monitoring to nonlinear industrial processes. Its performance depends critically on kernel parameters such as lengthscales and variance terms. In current…

Applications · Statistics 2026-03-20 Zina-Sabrina Duma, Victoria Jorry, Ayesha Safraz, Maria Paola di Crosta, Tuomas Sihvonen, Lassi Roininen, Satu-Pia Reinikainen

Empirical and Full Bayes estimation of the type of a Pitman-Yor process

The Pitman-Yor process is a random discrete probability distribution of which the atoms can be used to model the relative abundance of species. The process is indexed by a type parameter $\sigma$, which controls the number of different…

Statistics Theory · Mathematics 2022-08-31 S. E. M. P. Franssen, A. W. van der Vaart

A simple proof of Pitman-Yor's Chinese restaurant process from its stick-breaking representation

For a long time, the Dirichlet process has been the gold standard discrete random measure in Bayesian nonparametrics. The Pitman-Yor process provides a simple and mathematically tractable generalization, allowing for a very flexible…

Statistics Theory · Mathematics 2020-01-08 Caroline Lawless, Julyan Arbel

Dataset Meta-Learning from Kernel Ridge-Regression

One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are…

Machine Learning · Computer Science 2021-03-24 Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Nonparametric Regression under Cluster Sampling

This paper develops a general asymptotic theory for nonparametric kernel regression in the presence of cluster dependence. We examine nonparametric density estimation, Nadaraya-Watson kernel regression, and local linear estimation. Our…

Econometrics · Economics 2024-12-31 Yuya Shimizu

Stochastic approximations to the Pitman-Yor process

In this paper we consider approximations to the popular Pitman-Yor process obtained by truncating the stick-breaking representation. The truncation is determined by a random stopping rule that achieves an almost sure control on the…

Statistics Theory · Mathematics 2019-07-16 Julyan Arbel, Pierpaolo De Blasi, Igor Pruenster

The Kernel Mixture Network: A Nonparametric Method for Conditional Density Estimation of Continuous Random Variables

This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of…

Machine Learning · Statistics 2017-05-22 Luca Ambrogioni, Umut Güçlü, Marcel A. J. van Gerven, Eric Maris

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input.…

Machine Learning · Computer Science 2013-09-27 Amar Shah, Zoubin Ghahramani

Markov Stick-breaking Processes

Stick-breaking has a long history and is one of the most popular procedures for constructing random discrete distributions in Statistics and Machine Learning. In particular, due to their intuitive construction and computational tractability…

Statistics Theory · Mathematics 2026-01-26 María F. Gil-Leyva, Antonio Lijoi, Ramsés H. Mena, Igor Prünster

BKP: An R Package for Beta Kernel Process Modeling

We present BKP, a user-friendly and extensible R package that implements the Beta Kernel Process (BKP) -- a fully nonparametric and computationally efficient framework for modeling spatially varying binomial probabilities. The BKP model…

Computation · Statistics 2025-09-16 Jiangyan Zhao, Kunhai Qing, Jin Xu

Predictive Inference via Kernel Density Estimates

Kernel density estimation is a widely used nonparametric approach to estimate an unknown distribution. Recent work in Bayesian predictive inference has considered stochastic processes formed by specifying the predictive distribution for the…

Methodology · Statistics 2026-05-15 Torey Hilbert

Nonparametric Estimation of Probability Density Functions of Random Persistence Diagrams

We introduce a nonparametric way to estimate the global probability density function for a random persistence diagram. Precisely, a kernel density function centered at a given persistence diagram and a given bandwidth is constructed. Our…

Statistics Theory · Mathematics 2018-03-14 Joshua Lee Mike, Vasileios Maroulas

Central limit theorem for the homozygosity of the hierarchical Pitman-Yor process

The hierarchical Pitman-Yor process is a discrete random measure used as a prior in Bayesian nonparametrics. It is motivated by the study of groups of clustered data exhibiting power law behavior. Our focus in this paper is on the Gaussian…

Probability · Mathematics 2026-05-13 Shui Feng, J. E. Paguyo

Pitman-Yor Diffusion Trees

We introduce the Pitman-Yor Diffusion Tree (PYDT) for hierarchical clustering, a generalization of the Dirichlet Diffusion Tree (Neal, 2001) which removes the restriction to binary branching structure. The generative process is described…

Machine Learning · Statistics 2011-06-17 David A. Knowles, Zoubin Ghahramani

Kernel-based Information Criterion

This paper introduces Kernel-based Information Criterion (KIC) for model selection in regression analysis. The novel kernel-based complexity measure in KIC efficiently computes the interdependency between parameters of the model using a…

Machine Learning · Statistics 2014-12-16 Somayeh Danafar, Kenji Fukumizu, Faustino Gomez

The Bernstein-von Mises theorem for the Pitman-Yor process of nonnegative type

The Pitman-Yor process is a random probability distribution that can be used as a prior distribution in a nonparametric Bayesian analysis. The process is of species sampling type and generates discrete distributions, which yield of the…

Statistics Theory · Mathematics 2021-12-10 S. E. M. P. Franssen, A. W. van der Vaart