General Bayesian Policy Learning

2026-03-02 (v1). Subjects: Machine Learning; Econometrics; Statistics Theory; Methodology

Abstract

This study proposes a General Bayes framework for policy learning. We consider decision problems in which a decision-maker chooses an action from an action set to maximize expected welfare. Typical examples include treatment choice and portfolio selection. In such problems, the statistical target is a decision rule, and predicting each outcome $Y(a)$ is not necessarily of primary interest. We formulate this policy learning problem via loss-based Bayesian updating. Our main technical device is a squared-loss surrogate for welfare maximization. We show that maximizing empirical welfare over a policy class is equivalent to minimizing a scaled squared error in the outcome difference, up to a quadratic regularization term controlled by a tuning parameter $\zeta > 0$. This rewriting yields a General Bayes posterior over decision rules that admits a Gaussian pseudo-likelihood interpretation. We clarify two Bayesian interpretations of the resulting generalized posterior: a working Gaussian view and a decision-theoretic, loss-based view. As one implementation example, we introduce neural networks with tanh-squashed outputs. Finally, we provide theoretical guarantees in a PAC-Bayes style.
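The squared-loss surrogate can be illustrated numerically. The sketch below is not taken from the paper. It assumes a binary-action setting in which a policy outputs a score pi in (-1, 1), d denotes an outcome difference, and empirical welfare is the mean of pi * d up to policy-independent terms. Under those assumptions, (d - zeta * pi)^2 / (2 * zeta) = -pi * d + (zeta / 2) * pi^2 + d^2 / (2 * zeta), so minimizing the scaled squared error matches maximizing welfare up to a quadratic penalty and a constant. The function names, the synthetic data, and the exact scaling are hypothetical, and the paper's own normalization may differ.

```python
import math
import random

def welfare(pi, d):
    """Empirical welfare up to policy-independent terms: mean of pi_i * d_i."""
    return sum(p * di for p, di in zip(pi, d)) / len(d)

def surrogate_loss(pi, d, zeta):
    """Scaled squared error: mean of (d_i - zeta * pi_i)^2 / (2 * zeta)."""
    return sum((di - zeta * p) ** 2 for p, di in zip(pi, d)) / (2 * zeta * len(d))

def tanh_policy(x, w, b):
    """Hypothetical one-feature policy with a tanh-squashed output in (-1, 1)."""
    return math.tanh(w * x + b)

random.seed(0)
n, zeta = 200, 0.5
x = [random.gauss(0, 1) for _ in range(n)]
d = [xi + random.gauss(0, 1) for xi in x]        # synthetic outcome differences (assumption)
pi = [tanh_policy(xi, 1.3, -0.2) for xi in x]    # arbitrary illustrative parameters

# Quadratic regularization term and the policy-independent constant.
penalty = zeta / 2 * sum(p * p for p in pi) / n
const = sum(di * di for di in d) / (2 * zeta * n)

lhs = surrogate_loss(pi, d, zeta)
rhs = -welfare(pi, d) + penalty + const
assert abs(lhs - rhs) < 1e-9  # the identity holds for any policy values
```

Because the constant term does not depend on the policy, ranking policies by the surrogate loss is the same as ranking them by welfare minus the quadratic penalty, with zeta setting how strongly large scores are shrunk.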

Cite

@article{arxiv.2602.23672,
  title  = {General Bayesian Policy Learning},
  author = {Masahiro Kato},
  journal= {arXiv preprint arXiv:2602.23672},
  year   = {2026}
}