Differentially Private Bayesian Learning on Distributed Data

Machine Learning · 2017-05-30 · v2 · Cryptography and Security · Machine Learning · Computation

Abstract

Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness. We consider DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. We propose a learning strategy based on a secure multi-party sum function for aggregating summaries from data holders and the Gaussian mechanism for DP. Our method builds on an asymptotically optimal and practically efficient DP Bayesian inference with rapidly diminishing extra cost.
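The combination of a secure sum and the Gaussian mechanism described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the names `MODULUS`, `SCALE`, `share`, and `secure_noisy_sum`, the fixed-point encoding, and the number of compute servers are all assumptions made for illustration. The sketch also assumes that every party is honest and that each party's value has already been bounded, so that a noise scale `sigma` can be calibrated to the sensitivity of the sum. Each party adds Gaussian noise with variance `sigma**2 / n`, so the aggregate carries noise with variance `sigma**2`, and no single server sees an individual noisy value.

```python
import random
import secrets

MODULUS = 2 ** 61 - 1  # assumed ring size for additive shares (illustrative)
SCALE = 2 ** 16        # assumed fixed-point precision (illustrative)


def encode(x):
    """Map a real number to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v):
    """Map a residue back to a signed real number."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(secret, n_servers):
    """Split a residue into n_servers additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def secure_noisy_sum(values, sigma, n_servers=3, seed=None):
    """Sum the parties' values with total Gaussian noise of std sigma.

    Each party perturbs its own value with noise of std sigma / sqrt(n)
    and sends one additive share to each server. Servers only ever add
    shares, so individual values stay hidden unless all servers collude.
    """
    rng = random.Random(seed)  # noise source for the sketch; not cryptographic
    n = len(values)
    server_totals = [0] * n_servers
    for x in values:
        noisy = x + rng.gauss(0.0, sigma / n ** 0.5)
        for k, s in enumerate(share(encode(noisy), n_servers)):
            server_totals[k] = (server_totals[k] + s) % MODULUS
    # Only the combined server totals are ever decoded.
    return decode(sum(server_totals) % MODULUS)


if __name__ == "__main__":
    data = [0.5, -1.25, 2.0]
    print(secure_noisy_sum(data, sigma=0.0))  # noise-free sum of the inputs
    print(secure_noisy_sum(data, sigma=1.0))  # noisy sum, varies per run
```

Working in modular fixed-point arithmetic rather than floating point keeps the shares uniformly distributed and makes the reconstruction exact up to the chosen precision. A practical deployment would also need a secure noise source, encrypted channels between parties and servers, and a noise scale that accounts for parties that may collude.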

Cite

@article{arxiv.1703.01106,
  title  = {Differentially Private Bayesian Learning on Distributed Data},
  author = {Mikko Heikkilä and Eemil Lagerspetz and Samuel Kaski and Kana Shimizu and Sasu Tarkoma and Antti Honkela},
  journal= {arXiv preprint arXiv:1703.01106},
  year   = {2017}
}

Comments

13 pages, 7 figures. Modified text, changed the algorithm used, included tests on an additional dataset, fixed several errors, and added a proof of asymptotic efficiency to the supplement.