posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms

Computation 2024-07-09 v1

Abstract

The generality and robustness of inference algorithms are critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether based on Monte Carlo sampling or variational approximation, a fundamental problem is evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities, along with reference Monte Carlo draws. We further provide a guide to best practices in using posteriordb for model evaluation and comparison. posteriordb currently comprises 120 representative models, covering a wide range of realistic target densities, and has been instrumental in developing several general inference algorithms.
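
As a rough illustration of how reference draws can be used to check accuracy, the sketch below compares the posterior mean estimated by a candidate algorithm against the mean of the reference draws, scaled by the reference posterior standard deviation. It does not use the posteriordb API. The function name `standardized_mean_error` and the synthetic Gaussian draws are assumptions made for this example, standing in for a single scalar parameter of one posterior.

```python
import random
import statistics


def standardized_mean_error(candidate_draws, reference_draws):
    """Error in the estimated posterior mean, in units of the reference sd.

    Hypothetical helper for illustration; not part of posteriordb.
    """
    ref_mean = statistics.fmean(reference_draws)
    ref_sd = statistics.stdev(reference_draws)
    return (statistics.fmean(candidate_draws) - ref_mean) / ref_sd


# Synthetic stand-ins for one scalar parameter (assumed, not real data).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # "reference draws"
accurate = [rng.gauss(0.0, 1.0) for _ in range(1_000)]    # unbiased algorithm
biased = [rng.gauss(0.5, 1.0) for _ in range(1_000)]      # biased algorithm

err_accurate = standardized_mean_error(accurate, reference)
err_biased = standardized_mean_error(biased, reference)
```

Under these assumptions, `err_accurate` should sit near zero while `err_biased` should sit near 0.5. A real evaluation would repeat such a comparison across parameters and across many posteriors.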

Cite

@article{arxiv.2407.04967,
  title  = {posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms},
  author = {Måns Magnusson and Jakob Torgander and Paul-Christian Bürkner and Lu Zhang and Bob Carpenter and Aki Vehtari},
  journal= {arXiv preprint arXiv:2407.04967},
  year   = {2024}
}