Related papers: The $k$-anonymity Problem is Hard
The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization that has recently been proposed is $k$-anonymity. This approach requires that the rows of a table are…
We study the problem of anonymizing tables containing personal information before releasing them for public use. One of the formulations considered in this context is the $k$-anonymization problem: given a table, suppress a minimum number…
We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving…
The $k$-center problem is a classical combinatorial optimization problem which asks to find $k$ centers such that the maximum distance of any input point in a set $P$ to its assigned center is minimized. The problem allows for elegant…
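The objective stated in this abstract can be made concrete with a small sketch. It assumes points in the plane, Euclidean distance, and that each point is assigned to its nearest center; the function name `k_center_cost` and the sample data are illustrative and do not come from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_center_cost(points, centers):
    """Return the k-center objective: the largest distance from any
    point to its nearest center (assumed assignment rule)."""
    return max(min(dist(p, c) for c in centers) for p in points)


# Illustrative data: two well-separated pairs of points on a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]

good = k_center_cost(points, [(0, 0), (10, 0)])  # one center per pair
bad = k_center_cost(points, [(0, 0), (1, 0)])    # both centers in one pair
print(good, bad)
```

The sketch only evaluates a given choice of centers; finding the centers that minimize this value is the optimization problem itself.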
An important issue in releasing individual data is to protect the sensitive information from being leaked and maliciously utilized. Well-known privacy-preserving principles that aim to ensure both data privacy and data integrity, such as…
Motivated by recently discovered privacy attacks on social networks, we study the problem of anonymizing the underlying graph of interactions in a social network. We call a graph (k,l)-anonymous if for every node in the graph there exist at…
In this paper, we cast the classic problem of achieving k-anonymity for a given database as a problem in algebraic topology. Using techniques from this field of mathematics, we propose a framework for k-anonymity that brings new insights…
Data privacy and anonymisation are critical concerns in today's data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation…
We study the problem of abstracting a table of data about individuals so that no selection query can identify fewer than k individuals. We show that it is impossible to achieve arbitrarily good polynomial-time approximations for a number of…
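As a rough illustration of the requirement that no selection query identify fewer than k individuals, the sketch below checks a table under one common reading: every combination of quasi-identifier values that occurs must be shared by at least k rows. Restricting attention to equality selections over a fixed set of quasi-identifier columns is an assumption here, and the column names and data are hypothetical rather than taken from the paper.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Check that each occurring combination of quasi-identifier
    values is shared by at least k rows (assumed reading of the
    requirement, not the paper's formal model)."""
    groups = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in groups.values())


# Hypothetical table whose quasi-identifiers are already generalized.
table = [
    {"zip": "130**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "130**", "age": "20-29", "diagnosis": "asthma"},
    {"zip": "148**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "148**", "age": "30-39", "diagnosis": "diabetes"},
]

print(is_k_anonymous(table, ["zip", "age"], k=2))
print(is_k_anonymous(table, ["zip", "age"], k=3))
```

The check is the easy direction; choosing which values to abstract so that the check passes with little information loss is the part the abstract describes as hard to approximate.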
We study $k$-clustering problems with lower bounds, including $k$-median and $k$-means clustering with lower bounds. In addition to the point set $P$ and the number of centers $k$, a $k$-clustering problem with (uniform) lower bounds gets a…
The protection of private information is a crucial issue in data-driven research and business contexts. Typically, techniques like anonymisation or (selective) deletion are introduced in order to allow data sharing, e.g. in the case of…
The existing solutions to privacy preserving publication can be classified into the theoretical and heuristic categories. The former guarantees provably low information loss, whereas the latter incurs gigantic loss in the worst case, but is…
We focus on two mainstream privacy models: k-anonymity and differential privacy. Once a privacy model has been selected, the goal is to enforce it while preserving as much data utility as possible. The main objective of this thesis is to…
Huge volumes of data from domain-specific applications, such as medical, financial, telephone, and shopping records of individuals, are regularly generated. Sharing these data has proved beneficial for data mining applications. Since data…
This paper aims at answering the following two questions in privacy-preserving data analysis and publishing: What formal privacy guarantee (if any) does $k$-anonymization provide? How to benefit from the adversary's uncertainty about the…
Microaggregation is a technique for disclosure limitation aimed at protecting the privacy of data subjects in microdata releases. It has been used as an alternative to generalization and suppression to generate $k$-anonymous data sets,…
Consider a database of $n$ people, each represented by a bit-string of length $d$ corresponding to the setting of $d$ binary attributes. A $k$-way marginal query is specified by a subset $S$ of $k$ attributes, and an $|S|$-dimensional binary…
In this paper we consider the problem of anonymizing datasets in which each individual is associated with a set of items that constitute private information about the individual. Illustrative datasets include market-basket datasets and…
We consider the privacy problem in data publishing: given a relation I containing sensitive information, 'anonymize' it to obtain a view V such that, on the one hand, attackers cannot learn any sensitive information from V, and on the other hand…
The popularity of online social media platforms provides an unprecedented opportunity to study real-world complex networks of interactions. However, releasing this data to researchers and the public comes at the cost of potentially exposing…