General Bayesian Policy Learning

Masahiro Kato

Machine Learning | 2026-03-02 | v1 | Machine Learning, Econometrics, Statistics Theory, Methodology, Statistics Theory

Authors: Masahiro Kato

Abstract

This study proposes the General Bayes framework for policy learning. We consider decision problems in which a decision-maker chooses an action from an action set to maximize expected welfare. Typical examples include treatment choice and portfolio selection. In such problems, the statistical target is a decision rule, and predicting each outcome $Y(a)$ is not necessarily of primary interest. We formulate this policy learning problem through loss-based Bayesian updating. Our main technical device is a squared-loss surrogate for welfare maximization. We show that maximizing empirical welfare over a policy class is equivalent to minimizing a scaled squared error in the outcome difference, up to a quadratic regularization term controlled by a tuning parameter $\zeta>0$. This rewriting yields a General Bayes posterior over decision rules that admits a Gaussian pseudo-likelihood interpretation. We clarify two Bayesian interpretations of the resulting generalized posterior: a working Gaussian view and a decision-theoretic loss-based view. As one implementation example, we introduce neural networks with tanh-squashed outputs. Finally, we provide PAC-Bayes-style theoretical guarantees.
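The squared-loss rewriting described in the abstract can be illustrated with a small numerical sketch. It is not the paper's implementation; it only checks an elementary identity under stated assumptions. We assume a binary-action setting in which a policy outputs $\pi(x) \in [-1, 1]$, the per-unit outcome difference $d_i$ is observed directly (in practice it would need to be estimated), and empirical welfare is $\frac{1}{n}\sum_i \pi(x_i) d_i$. Under these assumptions, $\frac{1}{2\zeta}(d_i - \zeta\pi_i)^2 - \frac{\zeta}{2}\pi_i^2 = \frac{d_i^2}{2\zeta} - \pi_i d_i$, so the surrogate equals negative welfare plus a term that does not depend on the policy. The one-parameter policy class $\tanh(\theta x)$, the learning rate `eta`, the uniform prior, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def welfare(pi, d):
    """Empirical welfare (1/n) * sum_i pi_i * d_i, with pi_i in [-1, 1]."""
    return sum(p * di for p, di in zip(pi, d)) / len(d)


def surrogate(pi, d, zeta):
    """Scaled squared error in the outcome difference minus a quadratic term in pi."""
    n = len(d)
    squared = sum((di - zeta * p) ** 2 for p, di in zip(pi, d)) / (2 * zeta * n)
    quadratic = zeta * sum(p * p for p in pi) / (2 * n)
    return squared - quadratic


def gibbs_weights(losses, n, eta=1.0):
    """Uniform-prior generalized posterior weights proportional to exp(-eta * n * loss)."""
    shift = min(losses)  # subtract the minimum for numerical stability
    raw = [math.exp(-eta * n * (loss - shift)) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]


# Simulated data (assumption): d_i is a noisy version of the covariate x_i.
random.seed(0)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
d = [xi + random.gauss(0, 1) for xi in x]

# Hypothetical finite policy class: tanh-squashed linear scores.
thetas = [-2.0, -0.5, 0.0, 0.5, 2.0]
policies = [[math.tanh(t * xi) for xi in x] for t in thetas]

zeta = 0.7
const = sum(di * di for di in d) / (2 * zeta * n)  # policy-independent term

# Identity check: surrogate == const - welfare for every candidate policy.
gaps = [abs(surrogate(p, d, zeta) - (const - welfare(p, d))) for p in policies]

# Because the two losses differ by a constant, they give the same posterior weights.
w_surrogate = gibbs_weights([surrogate(p, d, zeta) for p in policies], n)
w_welfare = gibbs_weights([-welfare(p, d) for p in policies], n)

for t, w in zip(thetas, w_surrogate):
    print(f"theta={t:+.1f}  weight={w:.4f}")
```

In this sketch the exponentiated squared-error term plays the role of a Gaussian pseudo-likelihood for $d_i$ with mean $\zeta\pi(x_i)$, which is one way to read the "working Gaussian view" mentioned in the abstract. How the paper scales the loss and chooses the learning rate is not specified here.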

Keywords

Bayesian inference; Bayesian variable selection; statistical learning theory

Cite

@article{arxiv.2602.23672,
  title  = {General Bayesian Policy Learning},
  author = {Masahiro Kato},
  journal= {arXiv preprint arXiv:2602.23672},
  year   = {2026}
}
