Related papers: Diffix Elm: Simple Diffix

Diffix-Birch: Extending Diffix-Aspen

A longstanding open problem is that of how to get high quality statistics through direct queries to databases containing information about individuals without revealing information specific to those individuals. Diffix is a framework for…

Cryptography and Security · Computer Science 2019-08-22 Paul Francis, Sebastian Probst-Eide, Pawel Obrok, Cristian Berneanu, Sasa Juric, Reinhard Munz

Element Level Differential Privacy: The Right Granularity of Privacy

Differential Privacy (DP) provides strong guarantees on the risk of compromising a user's data in statistical learning applications, though these strong protections make learning challenging and may be too stringent for some use cases. To…

Machine Learning · Computer Science 2019-12-10 Hilal Asi, John Duchi, Omid Javidbakht

When the signal is in the noise: Exploiting Diffix's Sticky Noise

Anonymized data is highly valuable to both businesses and researchers. A large body of research has however shown the strong limits of the de-identification release-and-forget model, where data is anonymized and shared. This has led to the…

Cryptography and Security · Computer Science 2019-10-31 Andrea Gadotti, Florimond Houssiau, Luc Rocher, Benjamin Livshits, Yves-Alexandre de Montjoye

FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs

FLAIM (Framework for Log Anonymization and Information Management) addresses two important needs not well addressed by current log anonymizers. First, it is extremely modular and not tied to the specific log being anonymized. Second, it…

Cryptography and Security · Computer Science 2007-05-23 Adam Slagell, Kiran Lakkaraju, Katherine Luo

APEx: Accuracy-Aware Differentially Private Data Exploration

Organizations are increasingly interested in allowing external data scientists to explore their sensitive datasets. Due to the popularity of differential privacy, data owners want the data exploration to ensure provable privacy guarantees.…

Databases · Computer Science 2019-05-14 Chang Ge, Xi He, Ihab F. Ilyas, Ashwin Machanavajjhala

Model-based Large Language Model Customization as Service

Prominent Large Language Model (LLM) services from providers like OpenAI and Google excel at general tasks but often underperform on domain-specific applications. Current customization services for these LLMs typically require users to…

Machine Learning · Computer Science 2026-02-17 Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang

Individual Differential Privacy: A Utility-Preserving Formulation of Differential Privacy Guarantees

Differential privacy is a popular privacy model within the research community because of the strong privacy guarantee it offers, namely that the presence or absence of any individual in a data set does not significantly influence the…

Cryptography and Security · Computer Science 2017-02-09 Jordi Soria-Comas, Josep Domingo-Ferrer, David Sánchez, David Megías

A Comparison of SynDiffix Multi-table versus Single-table Synthetic Data

SynDiffix is a new open-source tool for structured data synthesis. It has anonymization features that allow it to generate multiple synthetic tables while maintaining strong anonymity. Compared to the more common single-table approach,…

Cryptography and Security · Computer Science 2024-03-14 Paul Francis

DPBloomfilter: Securing Bloom Filters with Differential Privacy

The Bloom filter is a simple yet space-efficient probabilistic data structure that supports membership queries for dramatically large datasets. It is widely utilized and implemented across various industrial scenarios, often handling…

Cryptography and Security · Computer Science 2026-01-26 Yekun Ke, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Jiahao Zhang

Accuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained ERM

Traditional approaches to differential privacy assume a fixed privacy requirement $\epsilon$ for a computation, and attempt to maximize the accuracy of the computation subject to the privacy constraint. As differential privacy is…

Machine Learning · Computer Science 2017-06-01 Katrina Ligett, Seth Neel, Aaron Roth, Bo Waggoner, Z. Steven Wu

Plume: Differential Privacy at Scale

Differential privacy has become the standard for private data analysis, and an extensive literature now offers differentially private solutions to a wide variety of problems. However, translating these solutions into practical systems often…

Cryptography and Security · Computer Science 2022-01-28 Kareem Amin, Jennifer Gillenwater, Matthew Joseph, Alex Kulesza, Sergei Vassilvitskii

A Noise Addition Scheme in Decision Tree for Privacy Preserving Data Mining

Data mining deals with automatic extraction of previously unknown patterns from large amounts of data. Organizations all over the world handle large amounts of data and are dependent on mining gigantic data sets for expansion of their…

Cryptography and Security · Computer Science 2010-03-25 Mohammad Ali Kadampur, Somayajulu D. V. L. N

Properties of Effective Information Anonymity Regulations

A firm seeks to analyze a dataset and to release the results. The dataset contains information about individual people, and the firm is subject to some regulation that forbids the release of the dataset itself. The regulation also imposes…

Computers and Society · Computer Science 2024-08-28 Aloni Cohen, Micah Altman, Francesca Falzon, Evangelina Anna Markatou, Kobbi Nissim

Perturbed M-Estimation: A Further Investigation of Robust Statistics for Differential Privacy

Differential Privacy (DP) provides an elegant mathematical framework for defining a provable disclosure risk in the presence of arbitrary adversaries; it guarantees that whether an individual is in a database or not, the results of a DP…

Cryptography and Security · Computer Science 2021-08-19 Aleksandra Slavkovic, Roberto Molinari

On the Importance of Conditioning for Privacy-Preserving Data Augmentation

Latent diffusion models can be used as a powerful augmentation method to artificially extend datasets for enhanced training. To the human eye, these augmented images look very different to the originals. Previous work has suggested to use…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Julian Lorenz, Katja Ludwig, Valentin Haug, Rainer Lienhart

PEEPLL: Privacy-Enhanced Event Pseudonymisation with Limited Linkability

Pseudonymisation provides the means to reduce the privacy impact of monitoring, auditing, intrusion detection, and data collection in general on individual subjects. Its application on data records, especially in an environment with…

Cryptography and Security · Computer Science 2020-04-22 Ephraim Zimmer, Christian Burkert, Tom Petersen, Hannes Federrath

Privacy Preservation by Disassociation

In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques protect the privacy of users either by altering the set of…

Databases · Computer Science 2012-07-03 Manolis Terrovitis, John Liagouris, Nikos Mamoulis, Spiros Skiadopoulos

Formalization of Differential Privacy in Isabelle/HOL

Differential privacy is a statistical definition of privacy that has attracted the interest of both academia and industry. Its formulations are easy to understand, but the differential privacy of databases is complicated to determine. One…

Logic in Computer Science · Computer Science 2024-10-25 Tetsuya Sato, Yasuhiko Minamide

Diversifying Anonymized Data with Diversity Constraints

Recently introduced privacy legislation has aimed to restrict and control the amount of personal data published by companies and shared to third parties. Much of this real data is not only sensitive requiring anonymization, but also…

Databases · Computer Science 2020-07-20 Mostafa Milani, Yu Huang, Fei Chiang

Slicing: A New Approach to Privacy Preserving Data Publishing

Several anonymization techniques, such as generalization and bucketization, have been designed for privacy preserving microdata publishing. Recent work has shown that generalization loses considerable amount of information, especially for…

Databases · Computer Science 2009-09-15 Tiancheng Li, Ninghui Li, Jian Zhang, Ian Molloy