Related papers: Differentially Private Set Union
Many data analysis operations can be expressed as a GROUP BY query on an unbounded set of partitions, followed by a per-partition aggregation. To make such a query differentially private, adding noise to each aggregation is not enough: we…
We study the problem of top-$k$ selection over a large domain universe subject to user-level differential privacy. Typically, the exponential mechanism or report noisy max are the algorithms used to solve this problem. However, these…
In the differentially private partition selection problem (a.k.a. private set union, private key discovery), users hold subsets of items from an unbounded universe. The goal is to output as many items as possible from the union of the…
Much of the literature on differential privacy focuses on item-level privacy, where loosely speaking, the goal is to provide privacy per item or training example. However, recently many practical applications such as federated learning…
We revisit the problem of $n$-gram extraction in the differential privacy setting. In this problem, given a corpus of private text data, the goal is to release as many $n$-grams as possible while preserving user level privacy. Extracting…
To protect user privacy in data analysis, a state-of-the-art strategy is differential privacy in which scientific noise is injected into the real analysis output. The noise masks individual's sensitive information contained in the dataset.…
The application of Differential Privacy to Natural Language Processing techniques has emerged in relevance in recent years, with an increasing number of studies published in established NLP outlets. In particular, the adaptation of…
Partition selection, or set union, is an important primitive in differentially private mechanism design: in a database where each user contributes a list of items, the goal is to publish as many of these items as possible under differential…
While the introduction of differential privacy has been a major breakthrough in the study of privacy preserving data publication, some recent work has pointed out a number of cases where it is not possible to limit inference about…
Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization. While differential privacy…
Differential privacy is the de-facto privacy standard in data analysis. The classic model of differential privacy considers the data to be static. The dynamic setting, called differential privacy under continual observation, captures many…
A basic problem in the design of privacy-preserving algorithms is the private maximization problem: the goal is to pick an item from a universe that (approximately) maximizes a data-dependent function, all under the constraint of…
As multi-agent systems become more numerous and more data-driven, novel forms of privacy are needed in order to protect data types that are not accounted for by existing privacy frameworks. In this paper, we present a new form of privacy…
Differential privacy is the state-of-the-art definition for privacy, guaranteeing that any analysis performed on a sensitive dataset leaks no information about the individuals whose data are contained therein. In this thesis, we develop…
We describe a new algorithm for answering a given set of range queries under $\epsilon$-differential privacy which often achieves substantially lower error than competing methods. Our algorithm satisfies differential privacy by adding noise…
Differential privacy is a formal mathematical {stand-ard} for quantifying the degree of that individual privacy in a statistical database is preserved. To guarantee differential privacy, a typical method is adding random noise to the…
Differential privacy is becoming a gold standard for privacy research; it offers a guaranteed bound on loss of privacy due to release of query results, even under worst-case assumptions. The theory of differential privacy is an active…
Differentially private (DP) mechanisms face the challenge of providing accurate results while protecting their inputs: the privacy-utility trade-off. A simple but powerful technique for DP adds noise to sensitivity-bounded query outputs to…
With the increasing applications of language models, it has become crucial to protect these models from leaking private information. Previous work has attempted to tackle this challenge by training RNN-based language models with…
Concern about how to aggregate sensitive user data without compromising individual privacy is a major barrier to greater availability of data. The model of differential privacy has emerged as an accepted model to release sensitive…