Bayesian Hypernetworks

David Krueger; Chin-Wei Huang; Riashat Islam; Ryan Turner; Alexandre Lacoste; Aaron Courville

Machine Learning; 2018-04-26; v2; Artificial Intelligence, Machine Learning

Authors: David Krueger, Chin-Wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste, Aaron Courville

Abstract

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $h$ is a neural network which learns to transform a simple noise distribution, $p(\boldsymbol{\epsilon}) = \mathcal{N}(\mathbf{0}, \mathbf{I})$, to a distribution $q(\theta) := q(h(\boldsymbol{\epsilon}))$ over the parameters $\theta$ of another neural network (the "primary network"). We train $q$ with variational inference, using an invertible $h$ to enable efficient estimation of the variational lower bound on the posterior $p(\theta \mid \mathcal{D})$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap i.i.d. sampling of $q(\theta)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.
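
The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-parameter linear model as the primary network, a standard normal prior, unit observation noise, and a hand-written affine coupling transform standing in for the invertible hypernetwork; the names `hypernet`, `log_joint`, `elbo_estimate`, and `phi` are hypothetical. Because the transform is invertible with a triangular Jacobian, the density of a sampled parameter vector follows from the change of variables formula, which is what makes a Monte Carlo estimate of the variational lower bound cheap.

```python
import math
import random

LOG_2PI = math.log(2.0 * math.pi)


def log_std_normal(v):
    """Log density of N(0, I) evaluated at the vector v."""
    return sum(-0.5 * x * x - 0.5 * LOG_2PI for x in v)


def hypernet(eps, phi):
    """Toy invertible h (an assumption for illustration): noise -> parameters.

    theta0 is an elementwise affine map of eps[0]; theta1 is an affine
    coupling of eps[1] whose scale and shift depend on eps[0].
    Returns theta and log|det dh/d(eps)|; the Jacobian is triangular,
    so the log-determinant is the sum of the two log-scales.
    """
    mu0, log_s0, a, b, c, d = phi
    theta0 = mu0 + math.exp(log_s0) * eps[0]
    s = a * eps[0] + b  # log-scale for the second coordinate
    t = c * eps[0] + d  # shift for the second coordinate
    theta1 = math.exp(s) * eps[1] + t
    return [theta0, theta1], log_s0 + s


def log_joint(theta, data, noise_std=1.0):
    """log p(D | theta) + log p(theta) for the toy primary network y = w*x + bias."""
    w, bias = theta
    total = log_std_normal(theta)  # standard normal prior (assumed)
    for x, y in data:
        r = (y - (w * x + bias)) / noise_std
        total += -0.5 * r * r - math.log(noise_std) - 0.5 * LOG_2PI
    return total


def elbo_estimate(phi, data, n_samples=64, seed=0):
    """Monte Carlo estimate of E_q[log p(D, theta) - log q(theta)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        theta, log_det = hypernet(eps, phi)
        # Change of variables: log q(theta) = log p(eps) - log|det J|.
        log_q = log_std_normal(eps) - log_det
        total += log_joint(theta, data) - log_q
    return total / n_samples


if __name__ == "__main__":
    toy_data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1)]
    print(elbo_estimate([0.0] * 6, toy_data))
```

In this sketch `phi` plays the role of the hypernetwork weights. Training would maximize `elbo_estimate` with respect to `phi` by gradient ascent, which is omitted here; a realistic hypernetwork would also use a far more expressive invertible transform than a single coupling step.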

Keywords

neural networks, Bayesian inference, Bayesian shrinkage

Cite

@article{arxiv.1710.04759,
  title  = {Bayesian Hypernetworks},
  author = {David Krueger and Chin-Wei Huang and Riashat Islam and Ryan Turner and Alexandre Lacoste and Aaron Courville},
  journal= {arXiv preprint arXiv:1710.04759},
  year   = {2018}
}

Comments

David Krueger and Chin-Wei Huang contributed equally
