Related papers: The $k$-anonymity Problem is Hard

Parameterized Complexity of the k-anonymity Problem

The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization that has been recently proposed is $k$-anonymity. This approach requires that the rows of a table are…

Data Structures and Algorithms · Computer Science 2013-11-20 Stefano Beretta, Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi, Yuri Pirola

On the Complexity of the $k$-Anonymization Problem

We study the problem of anonymizing tables containing personal information before releasing them for public use. One of the formulations considered in this context is the $k$-anonymization problem: given a table, suppress a minimum number…

Computational Complexity · Computer Science 2010-04-28 Venkatesan T. Chakaravarthy, Vinayaka Pandit, Yogish Sabharwal

Resolving the Complexity of Some Data Privacy Problems

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving…

Computational Complexity · Computer Science 2010-04-26 Jeremiah Blocki, Ryan Williams

Privacy preserving clustering with constraints

The $k$-center problem is a classical combinatorial optimization problem which asks to find $k$ centers such that the maximum distance of any input point in a set $P$ to its assigned center is minimized. The problem allows for elegant…

Computational Complexity · Computer Science 2018-02-19 Clemens Rösner, Melanie Schmidt

On the Complexity of $t$-Closeness Anonymization and Related Problems

An important issue in releasing individual data is to protect the sensitive information from being leaked and maliciously utilized. Famous privacy preserving principles that aim to ensure both data privacy and data integrity, such as…

Data Structures and Algorithms · Computer Science 2013-01-10 Hongyu Liang, Hao Yuan

Anonymizing Graphs

Motivated by recently discovered privacy attacks on social networks, we study the problem of anonymizing the underlying graph of interactions in a social network. We call a graph (k,l)-anonymous if for every node in the graph there exist at…

Databases · Computer Science 2008-11-03 Tomas Feder, Shubha U. Nabar, Evimaria Terzi

An Algebraic Topological Approach to Privacy: Numerical and Categorical Data

In this paper, we cast the classic problem of achieving k-anonymity for a given database as a problem in algebraic topology. Using techniques from this field of mathematics, we propose a framework for k-anonymity that brings new insights…

Databases · Computer Science 2016-02-23 Alberto Speranzon, Shaunak D. Bopardikar

SKALD: Scalable K-Anonymisation for Large Datasets

Data privacy and anonymisation are critical concerns in today's data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation…

Information Theory · Computer Science 2025-07-02 Kailash Reddy, Novoneel Chakraborty, Amogh Dharmavaram, Anshoo Tandon

On the Approximability of Geometric and Geographic Generalization and the Min-Max Bin Covering Problem

We study the problem of abstracting a table of data about individuals so that no selection query can identify fewer than k individuals. We show that it is impossible to achieve arbitrarily good polynomial-time approximations for a number of…

Data Structures and Algorithms · Computer Science 2009-05-12 Wenliang Du, David Eppstein, Michael T. Goodrich, George S. Lueker

Achieving anonymity via weak lower bound constraints for k-median and k-means

We study $k$-clustering problems with lower bounds, including $k$-median and $k$-means clustering with lower bounds. In addition to the point set $P$ and the number of centers $k$, a $k$-clustering problem with (uniform) lower bounds gets a…

Data Structures and Algorithms · Computer Science 2021-08-18 Anna Arutyunova, Melanie Schmidt

$k$-Anonymity in Practice: How Generalisation and Suppression Affect Machine Learning Classifiers

The protection of private information is a crucial issue in data-driven research and business contexts. Typically, techniques like anonymisation or (selective) deletion are introduced in order to allow data sharing, e.g. in the case of…

Machine Learning · Computer Science 2022-06-23 Djordje Slijepčević, Maximilian Henzl, Lukas Daniel Klausner, Tobias Dam, Peter Kieseberg, Matthias Zeppelzauer

The Hardness and Approximation Algorithms for L-Diversity

The existing solutions to privacy preserving publication can be classified into the theoretical and heuristic categories. The former guarantees provably low information loss, whereas the latter incurs gigantic loss in the worst case, but is…

Databases · Computer Science 2009-12-31 Xiaokui Xiao, Ke Yi, Yufei Tao

Improving data utility in differential privacy and k-anonymity

We focus on two mainstream privacy models: k-anonymity and differential privacy. Once a privacy model has been selected, the goal is to enforce it while preserving as much data utility as possible. The main objective of this thesis is to…

Cryptography and Security · Computer Science 2013-07-04 Jordi Soria-Comas

Privacy Gain Based Multi-Iterative k-Anonymization to Protect Respondents Privacy

Huge volumes of data from domain-specific applications such as medical, financial, telephone, shopping records and individuals are regularly generated. Sharing of these data has proved to be beneficial for data mining applications. Since data…

Methodology · Statistics 2014-03-21 Hitesh Chhinkaniwala, Sanjay Garg

On Sampling, Anonymization, and Differential Privacy: Or, k-Anonymization Meets Differential Privacy

This paper aims at answering the following two questions in privacy-preserving data analysis and publishing: What formal privacy guarantee (if any) does $k$-anonymization provide? How to benefit from the adversary's uncertainty about the…

Cryptography and Security · Computer Science 2015-03-17 Ninghui Li, Wahbeh Qardaji, Dong Su

t-Closeness through Microaggregation: Strict Privacy with Enhanced Utility Preservation

Microaggregation is a technique for disclosure limitation aimed at protecting the privacy of data subjects in microdata releases. It has been used as an alternative to generalization and suppression to generate $k$-anonymous data sets,…

Cryptography and Security · Computer Science 2016-08-06 Jordi Soria-Comas, Josep Domingo-Ferrer, David Sánchez, Sergio Martínez

Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

Consider a database of $n$ people, each represented by a bit-string of length $d$ corresponding to the setting of $d$ binary attributes. A $k$-way marginal query is specified by a subset $S$ of $k$ attributes, and a $|S|$-dimensional binary…

Data Structures and Algorithms · Computer Science 2013-08-07 Cynthia Dwork, Aleksandar Nikolov, Kunal Talwar

Anonymizing Unstructured Data

In this paper we consider the problem of anonymizing datasets in which each individual is associated with a set of items that constitute private information about the individual. Illustrative datasets include market-basket datasets and…

Databases · Computer Science 2008-11-04 Rajeev Motwani, Shubha U. Nabar

The Boundary Between Privacy and Utility in Data Anonymization

We consider the privacy problem in data publishing: given a relation I containing sensitive information, 'anonymize' it to obtain a view V such that, on one hand, attackers cannot learn any sensitive information from V, and on the other hand…

Databases · Computer Science 2007-05-23 Vibhor Rastogi, Dan Suciu, Sungho Hong

On the k-Anonymization of Time-varying and Multi-layer Social Graphs

The popularity of online social media platforms provides an unprecedented opportunity to study real-world complex networks of interactions. However, releasing this data to researchers and the public comes at the cost of potentially exposing…

Cryptography and Security · Computer Science 2015-03-24 Luca Rossi, Mirco Musolesi, Andrea Torsello