Statistical Inference for Generative Model Comparison

Machine Learning 2025-10-24 v3

Abstract

Generative models have achieved remarkable success across a range of applications, yet their evaluation still lacks principled uncertainty quantification. In this paper, we develop a method for comparing how close different generative models are to the underlying distribution of test samples. In particular, our approach uses the Kullback-Leibler (KL) divergence to measure the distance between a generative model and the unknown test distribution: KL requires no tuning parameters, such as the kernels used by RKHS-based distances, and it is the only f-divergence that admits a crucial cancellation enabling the uncertainty quantification. Furthermore, we extend our method to the comparison of conditional generative models and leverage Edgeworth expansions to address limited-data settings. On simulated datasets with known ground truth, we show that our approach achieves effective coverage rates and has higher power than kernel-based methods. When applied to generative models on image and text datasets, our procedure yields conclusions consistent with benchmark metrics, but with statistical confidence.
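
The cancellation mentioned in the abstract can be illustrated with a small sketch. For two models q1 and q2 and test distribution p, the difference KL(p||q1) - KL(p||q2) equals E_p[log q2(X) - log q1(X)], because the entropy term of the unknown p appears in both divergences and drops out. The code below is only an illustration under stated assumptions, not the paper's procedure: it assumes both model log-densities can be evaluated on the test samples, uses two toy Gaussian models, and builds a plain normal-approximation interval. The names `gaussian_logpdf` and `kl_difference_ci` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x (toy stand-in for a model's log-likelihood)."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def kl_difference_ci(log_q1, log_q2, level=0.95):
    """Estimate KL(p||q1) - KL(p||q2) from per-sample log-densities.

    The entropy of p cancels in the difference, so the estimate is the
    sample mean of log q2(x_i) - log q1(x_i) over test samples x_i ~ p.
    The interval is a simple normal approximation (an assumption of this
    sketch, not necessarily the construction used in the paper).
    """
    diffs = [b - a for a, b in zip(log_q1, log_q2)]
    n = len(diffs)
    estimate = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


# Toy setup: test data from N(0, 1); q1 = N(0.5, 1) and q2 = N(0.1, 1).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
log_q1 = [gaussian_logpdf(x, 0.5, 1.0) for x in xs]
log_q2 = [gaussian_logpdf(x, 0.1, 1.0) for x in xs]

estimate, (lower, upper) = kl_difference_ci(log_q1, log_q2)
print(f"estimated KL difference: {estimate:.4f}, 95% CI: ({lower:.4f}, {upper:.4f})")
```

In this toy setup the true difference is 0.125 - 0.005 = 0.12, and a positive value indicates that q2 is closer to the test distribution than q1. An interval that excludes zero is what lets the comparison carry statistical confidence.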

Cite

@article{arxiv.2501.18897,
  title  = {Statistical Inference for Generative Model Comparison},
  author = {Zijun Gao and Yan Sun and Han Su},
  journal= {arXiv preprint arXiv:2501.18897},
  year   = {2025}
}

Comments

35 pages, 25 figures