Related papers: An Algorithm for Streaming Differentially Private …
While differentially private synthetic data generation has been explored extensively in the literature, how to update this data in the future if the underlying private data changes is much less understood. We propose an algorithmic…
Differential privacy is a rigorous definition for privacy that guarantees that any analysis performed on a sensitive dataset leaks no information about the individuals whose data are contained therein. In this work, we develop new…
Differential privacy is the state-of-the-art definition for privacy, guaranteeing that any analysis performed on a sensitive dataset leaks no information about the individuals whose data are contained therein. In this thesis, we develop…
The need to analyze sensitive data, such as medical records or financial data, has created a critical research challenge in recent years. In this paper, we adopt the framework of differential privacy, and explore mechanisms for generating…
Techniques to deliver privacy-preserving synthetic datasets take a sensitive dataset as input and produce a similar dataset as output while maintaining differential privacy. These approaches have the potential to improve data sharing and…
Artificial intelligence and data access are already mainstream. One of the main challenges when designing an artificial intelligence or disclosing content from a database is preserving the privacy of individuals who participate in the…
Estimating the quantiles of a large dataset is a fundamental problem in both the streaming algorithms literature and the differential privacy literature. However, all existing private mechanisms for distribution-independent quantile…
Differentially private data generation techniques have become a promising solution to the data privacy challenge -- it enables sharing of data while complying with rigorous privacy guarantees, which is essential for scientific progress in…
Creation of a synthetic dataset that faithfully represents the data distribution and simultaneously preserves privacy is a major research challenge. Many space partitioning based approaches have emerged in recent years for answering…
The literature on data sanitization aims to design algorithms that take an input dataset and produce a privacy-preserving version of it, that captures some of its statistical properties. In this note we study this question from a streaming…
Privacy-preserving estimation of counts of items in streaming data finds applications in several real-world scenarios including word auto-correction and traffic management applications. Recent works of RAPPOR and Apple's count-mean sketch…
The streaming model of computation is a popular approach for working with large-scale data. In this setting, there is a stream of items and the goal is to compute the desired quantities (usually data statistics) while making a single pass…
Private synthetic data sharing is preferred as it keeps the distribution and nuances of original data compared to summary statistics. The state-of-the-art methods adopt a select-measure-generate paradigm, but measuring large domain…
Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. In this paper, we introduce…
While power systems research relies on the availability of real-world network datasets, data owners (e.g., system operators) are hesitant to share data due to security and privacy risks. To control these risks, we develop privacy-preserving…
In a world where artificial intelligence and data science become omnipresent, data sharing is increasingly locking horns with data-privacy concerns. Differential privacy has emerged as a rigorous framework for protecting individual privacy…
Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to…
Motivated by privacy concerns in long-term longitudinal studies in medical and social science research, we study the problem of continually releasing differentially private synthetic data from longitudinal data collections. We introduce a…
Online learning has been in the spotlight from the machine learning society for a long time. To handle massive data in Big Data era, one single learner could never efficiently finish this heavy task. Hence, in this paper, we propose a novel…
Differential privacy allows quantifying privacy loss resulting from accessing sensitive personal data. Repeated accesses to underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this…