Post-Processing Posterior Predictive P-values

Methodology 2026-05-26 v1

Abstract

This article addresses issues of model criticism and model comparison in Bayesian contexts, and focusses on the use of so-called posterior predictive p-values (ppp values). These involve a general discrepancy or conflict measure and depend on the prior, the model, and the data. They are used in statistical practice to quantify the degree of surprise or conflict in data, and to compare different combinations of prior and model. The distribution of such ppp values is, however, far from uniform, as we demonstrate for different models, which makes their interpretation and comparison difficult. We propose a natural calibration of the ppp values, where the resulting cppp values are uniform on the unit interval under model conditions. The cppp values, which in general rely on a double simulation scheme for their computation, may then be used to assess and compare different priors and models. Our methods also make it possible to compare parametric with nonparametric model specifications, in that genuine 'measures of surprise' are put on the same canonical uniform scale. Our techniques are illustrated in some applications to real data. We also present supplementary theoretical results on various properties of the ppp and cppp values.
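The double simulation scheme mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model y_i ~ N(theta, 1) with prior theta ~ N(0, tau^2), the discrepancy max|y_i - mean(y)|, the function names `ppp_value` and `cppp_value`, and the reading of "under model conditions" as drawing calibration data sets from the prior predictive distribution. The inner loop estimates a ppp value for one data set; the outer loop repeats this for simulated data sets and reports how often their ppp falls at or below the observed one.

```python
import numpy as np

def ppp_value(y, rng, tau=2.0, n_rep=400):
    """Monte Carlo ppp for the assumed toy model y_i ~ N(theta, 1), theta ~ N(0, tau^2)."""
    n = y.size
    # Conjugate normal posterior for theta given y (known unit variance).
    post_var = 1.0 / (n + 1.0 / tau**2)
    post_mean = post_var * y.sum()
    theta = rng.normal(post_mean, np.sqrt(post_var), size=n_rep)
    # One replicated data set per posterior draw of theta.
    y_rep = rng.normal(theta[:, None], 1.0, size=(n_rep, n))
    # Assumed discrepancy: largest absolute deviation from the sample mean.
    t_obs = np.max(np.abs(y - y.mean()))
    t_rep = np.max(np.abs(y_rep - y_rep.mean(axis=1, keepdims=True)), axis=1)
    return float(np.mean(t_rep >= t_obs))

def cppp_value(y_obs, rng, tau=2.0, n_outer=300, n_rep=400):
    """Calibrate the ppp by a second (outer) layer of simulation."""
    p_obs = ppp_value(y_obs, rng, tau, n_rep)
    n = y_obs.size
    p_sim = np.empty(n_outer)
    for b in range(n_outer):
        # Outer draw: a data set from the prior predictive (an assumption here).
        theta = rng.normal(0.0, tau)
        y_sim = rng.normal(theta, 1.0, size=n)
        # Inner simulation: the ppp value of that simulated data set.
        p_sim[b] = ppp_value(y_sim, rng, tau, n_rep)
    # Proportion of simulated ppp values at or below the observed one.
    return p_obs, float(np.mean(p_sim <= p_obs))

rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, size=20)
y[0] = 10.0  # plant one gross outlier so the data conflict with the model
p, cp = cppp_value(y, rng)
print(f"ppp = {p:.3f}, cppp = {cp:.3f}")
```

The cost is roughly `n_outer * n_rep` replicated data sets, which is why the calibration is more expensive than a plain ppp computation. Both Monte Carlo sizes are arbitrary choices in this sketch and would need to be larger for stable estimates in the tails.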

Cite

@article{arxiv.2605.24169,
  title  = {Post-Processing Posterior Predictive P-values},
  author = {Nils Lid Hjort and Fredrik A. Dahl and Gunnhildur Högnadóttir Steinbakk},
  journal= {arXiv preprint arXiv:2605.24169},
  year   = {2026}
}

Comments

35 pages, 5 figures. This is the authors' Statistical Research Report, Department of Mathematics, University of Oslo, from 2005, later accepted in modified form in the Journal of the American Statistical Association, 2006, vol. 101, pp. 1157-1174.