Algorithmically Effective Differentially Private Synthetic Data

Data Structures and Algorithms, Cryptography and Security, Probability, Statistics Theory. 2023-05-22, v3

Abstract

We present a highly effective algorithmic approach for generating $\varepsilon$-differentially private synthetic data in a bounded metric space with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset $X$ in the hypercube $[0,1]^d$, our algorithm generates a synthetic dataset $Y$ such that the expected 1-Wasserstein distance between the empirical measures of $X$ and $Y$ is $O((\varepsilon n)^{-1/d})$ for $d\geq 2$, and is $O(\log^2(\varepsilon n)(\varepsilon n)^{-1})$ for $d=1$. The accuracy guarantee is optimal up to a constant factor for $d\geq 2$, and up to a logarithmic factor for $d=1$. Our algorithm has a fast running time of $O(\varepsilon dn)$ for all $d\geq 1$ and demonstrates improved accuracy compared to the method in (Boedihardjo et al., 2022) for $d\geq 2$.
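To make the stated rates concrete, the following is a minimal Python sketch, not the authors' algorithm. It evaluates the two accuracy rates from the abstract with the hidden constant set to 1 and the logarithm taken as natural (both assumptions), and computes the 1-Wasserstein distance between two equal-size empirical measures on the line by matching sorted samples. The function names `utility_rate` and `wasserstein1_1d` are hypothetical and chosen only for illustration.

```python
import math


def utility_rate(eps: float, n: int, d: int) -> float:
    """Accuracy rate from the abstract, with the hidden constant assumed to be 1.

    d >= 2: (eps * n) ** (-1 / d)
    d == 1: log(eps * n) ** 2 / (eps * n), natural log assumed
    """
    if eps <= 0 or n <= 0 or d < 1:
        raise ValueError("need eps > 0, n > 0 and d >= 1")
    m = eps * n
    if d == 1:
        return math.log(m) ** 2 / m
    return m ** (-1.0 / d)


def wasserstein1_1d(xs, ys):
    """1-Wasserstein distance between the empirical measures of two
    equal-size samples on the real line.

    In one dimension the optimal coupling pairs the i-th smallest point
    of xs with the i-th smallest point of ys.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)
```

For example, with $\varepsilon = 1$, $n = 10^4$ and $d = 2$, `utility_rate(1.0, 10_000, 2)` evaluates $(\varepsilon n)^{-1/2} = 0.01$, which is the stated bound up to the unspecified constant factor.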

Cite

@article{arxiv.2302.05552,
  title  = {Algorithmically Effective Differentially Private Synthetic Data},
  author = {Yiyun He and Roman Vershynin and Yizhe Zhu},
  journal= {arXiv preprint arXiv:2302.05552},
  year   = {2023}
}

Comments

23 pages. To appear in COLT 2023.
