Estimating a probability mass function with unknown labels

Statistics Theory 2018-01-12 v5

Abstract

In the context of a species sampling problem, we discuss a non-parametric maximum likelihood estimator for the underlying probability mass function. The estimator is known in the computer science literature as the high profile estimator. We prove strong consistency and derive rates of convergence for an extended model version of the estimator. We also study a sieved estimator, for which similar consistency results are derived. Numerical computation of the sieved estimator is of great interest for practical problems, such as forensic DNA analysis, and we present a computational algorithm based on the stochastic approximation of the expectation maximisation algorithm. As an interesting byproduct of the numerical analyses, we introduce an algorithm for bounded isotonic regression, for which we also prove convergence.

Cite

@article{arxiv.1312.1200,
  title  = {Estimating a probability mass function with unknown labels},
  author = {Dragi Anevski and Richard D. Gill and Stefan Zohren},
  journal= {arXiv preprint arXiv:1312.1200},
  year   = {2018}
}

Comments

45 pages (30 pages + 15-page appendix); v5 corrects a technical problem with the PDF output of v4, otherwise unchanged
