Related papers: Private Data Release via Learning Thresholds
In this paper we demonstrate that, ignoring computational constraints, it is possible to privately release synthetic databases that are useful for large classes of queries -- much larger in size than the database itself. Specifically, we…
Concern about how to aggregate sensitive user data without compromising individual privacy is a major barrier to greater availability of data. The model of differential privacy has emerged as an accepted model to release sensitive…
We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$…
We revisit the problem of accurately answering large classes of statistical queries while preserving differential privacy. Previous approaches to this problem have either been very general but have not had run-time polynomial in the size of…
We study the problem of answering $k$-way marginal queries on a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a $k$-way marginal query is the fraction of the database's records $x \in \{0,1\}^d$…
When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors. We propose to use the learning-augmented algorithms (or algorithms…
We study synthetic data release for answering multiple linear queries over a set of database tables in a differentially private way. Two special cases have been considered in the literature: how to release a synthetic dataset for answering…
Ensuring differential privacy of models learned from sensitive user data is an important goal that has been studied extensively in recent years. It is now known that for some basic learning problems, especially those involving…
In this paper, we study the problem of publishing a stream of real-valued data satisfying differential privacy (DP). One major challenge is that the maximal possible value can be quite large; thus it is necessary to estimate a threshold so…
Consider a database of $n$ people, each represented by a bit-string of length $d$ corresponding to the setting of $d$ binary attributes. A $k$-way marginal query is specified by a subset $S$ of $k$ attributes, and a $|S|$-dimensional binary…
Differential privacy is a de facto standard for statistical computations over databases that contain private data. The strength of differential privacy lies in a rigorous mathematical definition that guarantees individual privacy and yet…
We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of…
We study the problem of releasing $k$-way marginals of a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a $k$-way marginal query is the fraction of $D$'s records $x \in \{0,1\}^d$ with a given value in…
Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to…
Differential privacy is a strong notion for protecting individual privacy in privacy preserving data analysis or publishing. In this paper, we study the problem of differentially private histogram release for random workloads. We study two…
Due to statistical lower bounds on the learnability of many function classes under privacy constraints, there has been recent interest in leveraging public data to improve the performance of private learning algorithms. In this model,…
Differential privacy is becoming a gold standard for privacy research; it offers a guaranteed bound on loss of privacy due to release of query results, even under worst-case assumptions. The theory of differential privacy is an active…
Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many of the popular techniques such as data de-identification are problematic because of their dependence on the background knowledge of…
The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when…
We consider accurately answering smooth queries while preserving differential privacy. A query is said to be $K$-smooth if it is specified by a function defined on $[-1,1]^d$ whose partial derivatives up to order $K$ are all bounded. We…