Related papers: Improved Generalization Guarantees in Restricted D…
While pursuing better utility by discovering knowledge from the data, individual's privacy may be compromised during an analysis. To that end, differential privacy has been widely recognized as the state-of-the-art privacy notion. By…
Differential privacy is a popular privacy model within the research community because of the strong privacy guarantee it offers, namely that the presence or absence of any individual in a data set does not significantly influence the…
Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model. It has been recently shown that simple heuristics can reconstruct data samples from language models, making this…
Recent research in differential privacy demonstrated that (sub)sampling can amplify the level of protection. For example, for $\epsilon$-differential privacy and simple random sampling with sampling rate $r$, the actual privacy guarantee is…
Differentially private training algorithms provide protection against one of the most popular attacks in machine learning: the membership inference attack. However, these privacy algorithms incur a loss of the model's classification…
Privacy-preserving data analysis is a rising challenge in contemporary statistics, as the privacy guarantees of statistical methods are often achieved at the expense of accuracy. In this paper, we investigate the tradeoff between…
In modern settings of data analysis, we may be running our algorithms on datasets that are sensitive in nature. However, classical machine learning and statistical algorithms were not designed with these risks in mind, and it has been…
In large-scale statistical learning, data collection and model fitting are moving increasingly toward peripheral devices---phones, watches, fitness trackers---away from centralized data collection. Concomitant with this rise in…
Differential privacy is becoming a gold standard for privacy research; it offers a guaranteed bound on loss of privacy due to release of query results, even under worst-case assumptions. The theory of differential privacy is an active…
This paper presents ongoing research focused on improving the utility of data protected by Global Differential Privacy(DP) in the scenario of summary statistics. Our approach is based on predictions on how an analyst will use statistics…
Differential privacy is often applied with a privacy parameter that is larger than the theory suggests is ideal; various informal justifications for tolerating large privacy parameters have been proposed. In this work, we consider partial…
Concern about how to aggregate sensitive user data without compromising individual privacy is a major barrier to greater availability of data. The model of differential privacy has emerged as an accepted model to release sensitive…
Differential privacy is a strong notion for privacy that can be used to prove formal guarantees, in terms of a privacy budget, $\epsilon$, about how much information is leaked by a mechanism. However, implementations of privacy-preserving…
Differential privacy (DP) is a popular mechanism for training machine learning models with bounded leakage about the presence of specific points in the training data. The cost of differential privacy is a reduction in the model's accuracy.…
This work studies formal utility and privacy guarantees for a simple multiplicative database transformation, where the data are compressed by a random linear or affine transformation, reducing the number of data records substantially, while…
Privacy protection and uncertainty quantification are increasingly important in data-driven decision making. Conformal prediction provides finite-sample marginal coverage, but existing private approaches often rely on data splitting,…
Differential privacy is becoming one gold standard for protecting the privacy of publicly shared data. It has been widely used in social science, data science, public health, information technology, and the U.S. decennial census.…
Privacy models were introduced in privacy-preserving data publishing and statistical disclosure control with the promise to end the need for costly empirical assessment of disclosure risk. We examine how well this promise is kept by the…
In statistical disclosure control, the goal of data analysis is twofold: The released information must provide accurate and useful statistics about the underlying population of interest, while minimizing the potential for an individual…
Differential privacy allows quantifying privacy loss resulting from accessing sensitive personal data. Repeated accesses to underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this…