Related papers: Algorithmically Effective Differentially Private S…

200 papers

We present a polynomial-time algorithm for online differentially private synthetic data generation. For a data stream within the hypercube $[0,1]^d$ and an infinite time horizon, we develop an online algorithm that generates a…

Statistics Theory · Mathematics 2024-10-31 Yiyun He, Roman Vershynin, Yizhe Zhu

Differentially private synthetic data provide a powerful mechanism to enable data analysis while protecting sensitive information about individuals. However, when the data lie in a high-dimensional space, the accuracy of the synthetic data…

Machine Learning · Computer Science 2024-12-12 Yiyun He, Thomas Strohmer, Roman Vershynin, Yizhe Zhu

Networks are popular for representing complex data. In particular, differentially private synthetic networks are much in demand for method and algorithm development. The network generator should be easy to implement and should come with…

Machine Learning · Statistics 2025-02-18 Leoni Carla Wirth, Gholamali Aminian, Gesine Reinert

Differentially private synthetic data enables the sharing and analysis of sensitive datasets while providing rigorous privacy guarantees for individual contributors. A central challenge is to achieve strong utility guarantees for meaningful…

Statistics Theory · Mathematics 2026-02-06 Rundong Ding, Yiyun He, Yizhe Zhu

We propose $\mathtt{PrivHP}$, a lightweight synthetic data generator with differential privacy guarantees. $\mathtt{PrivHP}$ uses a novel hierarchical decomposition that approximates the input's cumulative distribution function…

Cryptography and Security · Computer Science 2025-12-10 Rayne Holland, Seyit Camtepe, Chandra Thapa, Minhui Xue

Squared Wasserstein distance is a frequently used tool to measure discrepancy between probability distributions. This distance is typically computed between empirical measures of size $n$ from two underlying random samples. Unfortunately,…

Machine Learning · Statistics 2026-05-20 Peter Matthew Jacobs, Jeff M. Phillips

We present the first $\varepsilon$-differentially private, computationally efficient algorithm that estimates the means of product distributions over $\{0,1\}^d$ accurately in total-variation distance, whilst attaining the optimal sample…

Data Structures and Algorithms · Computer Science 2024-01-29 Vikrant Singhal

The need to analyze sensitive data, such as medical records or financial data, has created a critical research challenge in recent years. In this paper, we adopt the framework of differential privacy, and explore mechanisms for generating…

Cryptography and Security · Computer Science 2024-05-09 Nikolija Bojkovic, Po-Ling Loh

Given a dataset of $n$ user-contributed strings, each of length at most $\ell$, a key problem is how to identify all frequent substrings while preserving each user's privacy. Recent work by Bernardini et al. (PODS'25) introduced a…

Data Structures and Algorithms · Computer Science 2026-03-11 Peaker Guo, Rayne Holland, Hao Wu

Techniques to deliver privacy-preserving synthetic datasets take a sensitive dataset as input and produce a similar dataset as output while maintaining differential privacy. These approaches have the potential to improve data sharing and…

Databases · Computer Science 2018-08-24 Luke Rodriguez, Bill Howe

We present three new algorithms for constructing differentially private synthetic data: a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries. All three algorithms…

Machine Learning · Computer Science 2020-07-13 Giuseppe Vietri, Grace Tian, Mark Bun, Thomas Steinke, Zhiwei Steven Wu

We study the problem of efficiently generating differentially private synthetic data that approximate the statistical properties of an underlying sensitive dataset. In recent years, there has been a growing line of work that approaches this…

Neural and Evolutionary Computing · Computer Science 2023-06-07 Terrance Liu, Jingwu Tang, Giuseppe Vietri, Zhiwei Steven Wu

Creation of a synthetic dataset that faithfully represents the data distribution and simultaneously preserves privacy is a major research challenge. Many space partitioning based approaches have emerged in recent years for answering…

Cryptography and Security · Computer Science 2023-06-26 Eleonora Kreačić, Navid Nouri, Vamsi K. Potluru, Tucker Balch, Manuela Veloso

Wasserstein distance is a key metric for quantifying data divergence from a distributional perspective. However, its application in privacy-sensitive environments, where direct sharing of raw data is prohibited, presents significant…

Machine Learning · Computer Science 2025-02-04 Wenqian Li, Yan Pang

Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to…

Cryptography and Security · Computer Science 2024-03-26 March Boedihardjo, Thomas Strohmer, Roman Vershynin

Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an $(\varepsilon, \delta)$-differentially private algorithm obtaining an…

Data Structures and Algorithms · Computer Science 2025-05-27 Syamantak Kumar, Daogao Liu, Kevin Tian, Chutong Yang

Protecting user data privacy can be achieved via many methods, from statistical transformations to generative models. However, all of them have critical drawbacks. For example, creating a transformed data set using traditional techniques is…

Machine Learning · Computer Science 2024-04-24 Tânia Carvalho, Nuno Moniz, Luís Antunes, Nitesh Chawla

The all-pairs shortest distances (APSD) with differential privacy (DP) problem takes as input an undirected, weighted graph $G = (V,E, \mathbf{w})$ and outputs a private estimate of the shortest distances in $G$ between all pairs of…

Data Structures and Algorithms · Computer Science 2024-07-16 Jesse Campbell, Chunjiang Zhu

We develop a general framework for statistical inference with the 1-Wasserstein distance. Recently, the Wasserstein distance has attracted considerable attention and has been widely applied to various machine learning tasks because of its…

Statistics Theory · Mathematics 2022-02-16 Masaaki Imaizumi, Hirofumi Ota, Takuo Hamaguchi

We consider accurately answering smooth queries while preserving differential privacy. A query is said to be $K$-smooth if it is specified by a function defined on $[-1,1]^d$ whose partial derivatives up to order $K$ are all bounded. We…

Databases · Computer Science 2014-01-07 Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Liwei Wang