Reconstructing Training Data from Model Gradient, Provably

Machine Learning 2023-06-13 v3 Cryptography and Security Machine Learning

Abstract

Understanding when and how much a model gradient leaks information about the training samples is an important question in privacy. In this paper, we present a surprising result: even without training on or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions, for both shallow and deep neural networks and for a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. Because this is a provable attack that reveals sensitive training data, our findings point to potentially severe threats to privacy, especially in federated learning.
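To make the leakage claim concrete, here is a minimal sketch of the simplest well-known special case. It is not the tensor-decomposition algorithm described in the abstract. It assumes a single training sample, a two-layer ReLU network with a first-layer bias, and squared loss. All function and variable names are hypothetical. Under these assumptions the gradient with respect to row j of the first-layer weights equals the bias gradient of unit j times the input, so one gradient query at random parameters reveals the input exactly.

```python
import random


def single_sample_gradient(W, b, a, x, y):
    """Gradients of 0.5 * (a . relu(W x + b) - y)^2 with respect to W and b."""
    # Pre-activations of the hidden layer.
    h = [sum(w * xk for w, xk in zip(row, x)) + bj for row, bj in zip(W, b)]
    # Network output and residual on the single sample.
    pred = sum(aj * max(hj, 0.0) for aj, hj in zip(a, h))
    r = pred - y
    # delta[j] = dLoss/dh[j]; it is also the gradient with respect to b[j].
    delta = [r * aj * (1.0 if hj > 0 else 0.0) for aj, hj in zip(a, h)]
    # dLoss/dW[j][k] = delta[j] * x[k]
    grad_W = [[dj * xk for xk in x] for dj in delta]
    return grad_W, delta


def reconstruct_input(grad_W, grad_b, tol=1e-12):
    """Recover x as grad_W[j] / grad_b[j] for the unit with the largest bias gradient."""
    j = max(range(len(grad_b)), key=lambda i: abs(grad_b[i]))
    if abs(grad_b[j]) < tol:
        raise ValueError("no hidden unit has a usable gradient")
    return [g / grad_b[j] for g in grad_W[j]]


rng = random.Random(0)
d, m = 5, 16  # input dimension and hidden width (illustrative values)
x = [rng.uniform(-1.0, 1.0) for _ in range(d)]  # the "private" sample
y = 1.0
# Randomly chosen parameters; no training is performed.
W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
b = [rng.gauss(0.0, 1.0) for _ in range(m)]
a = [rng.gauss(0.0, 1.0) for _ in range(m)]

grad_W, grad_b = single_sample_gradient(W, b, a, x, y)
x_rec = reconstruct_input(grad_W, grad_b)
max_err = max(abs(u - v) for u, v in zip(x, x_rec))
print("max reconstruction error:", max_err)
```

This shortcut breaks down once gradients are averaged over several samples, since the per-sample terms mix together. That batched setting is the one the abstract addresses with tensor decomposition.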

Cite

@article{arxiv.2212.03714,
  title  = {Reconstructing Training Data from Model Gradient, Provably},
  author = {Zihan Wang and Jason D. Lee and Qi Lei},
  journal= {arXiv preprint arXiv:2212.03714},
  year   = {2023}
}