Related papers: Anonymizing Unstructured Data

An Effective Clustering Approach to Web Query Log Anonymization

Web query log data contain information useful to research; however, release of such data can re-identify the search engine users issuing the queries. These privacy concerns go far beyond removing explicitly identifying information such as…

Databases · Computer Science 2010-12-06 Amin Milani Fard, Ke Wang

($k$,$\epsilon$)-Anonymity: $k$-Anonymity with $\epsilon$-Differential Privacy

The explosion in volume and variety of data offers enormous potential for research and commercial use. Increased availability of personal data is of particular interest in enabling highly customised services tuned to individual needs.…

Cryptography and Security · Computer Science 2017-10-05 Naoise Holohan, Spiros Antonatos, Stefano Braghin, Pól Mac Aonghusa

On the Evaluation of the Privacy Breach in Disassociated Set-Valued Datasets

Data anonymization is gaining much attention these days as it provides the fundamental requirements to safely outsource datasets containing identifying information. While some techniques add noise to protect privacy, others use…

Cryptography and Security · Computer Science 2016-11-28 Sara Barakat, Bechara Al Bouna, Mohamed Nassar, Christophe Guyeux

SKALD: Scalable K-Anonymisation for Large Datasets

Data privacy and anonymisation are critical concerns in today's data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation…

Information Theory · Computer Science 2025-07-02 Kailash Reddy, Novoneel Chakraborty, Amogh Dharmavaram, Anshoo Tandon

An Algebraic Topological Approach to Privacy: Numerical and Categorical Data

In this paper, we cast the classic problem of achieving k-anonymity for a given database as a problem in algebraic topology. Using techniques from this field of mathematics, we propose a framework for k-anonymity that brings new insights…

Databases · Computer Science 2016-02-23 Alberto Speranzon, Shaunak D. Bopardikar

Anonymizing Machine Learning Models

There is a known tension between the need to analyze personal data to drive business and privacy concerns. Many data protection regulations, including the EU General Data Protection Regulation (GDPR) and the California Consumer Protection…

Cryptography and Security · Computer Science 2022-02-02 Abigail Goldsteen, Gilad Ezov, Ron Shmelkin, Micha Moffie, Ariel Farkash

$k$-Anonymity in Practice: How Generalisation and Suppression Affect Machine Learning Classifiers

The protection of private information is a crucial issue in data-driven research and business contexts. Typically, techniques like anonymisation or (selective) deletion are introduced in order to allow data sharing, e.g. in the case of…

Machine Learning · Computer Science 2022-06-23 Djordje Slijepčević, Maximilian Henzl, Lukas Daniel Klausner, Tobias Dam, Peter Kieseberg, Matthias Zeppelzauer

The $k$-anonymity Problem is Hard

The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization recently proposed is the k-anonymity. This approach requires that the rows in a table are clustered in sets…

Databases · Computer Science 2009-06-02 Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi

Can the Utility of Anonymized Data be used for Privacy Breaches?

Group-based anonymization is the most widely studied approach for privacy-preserving data publishing. This includes k-anonymity, l-diversity, and t-closeness, to name a few. The goal of this paper is to raise a fundamental issue on the…

Databases · Computer Science 2009-05-13 Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang, Yabo Xu, Philip S. Yu

Comparison of machine learning models applied on anonymized data with different techniques

Anonymization techniques based on obfuscating the quasi-identifiers by means of value generalization hierarchies are widely used to achieve preset levels of privacy. To prevent different types of attacks against database privacy it is…

Machine Learning · Computer Science 2023-05-15 Judith Sáinz-Pardo Díaz, Álvaro López García

Secure k-Anonymization over Encrypted Databases

Data protection algorithms are becoming increasingly important to support modern business needs for facilitating data sharing and data monetization. Anonymization is an important step before data sharing. Several organizations leverage on…

Cryptography and Security · Computer Science 2021-08-11 Manish Kesarwani, Akshar Kaul, Stefano Braghin, Naoise Holohan, Spiros Antonatos

Anonymizing k-Facial Attributes via Adversarial Perturbations

A face image not only provides details about the identity of a subject but also reveals several attributes such as gender, race, sexual orientation, and age. Advancements in machine learning algorithms and popularity of sharing images on…

Computer Vision and Pattern Recognition · Computer Science 2018-10-01 Saheb Chhabra, Richa Singh, Mayank Vatsa, Gaurav Gupta

Learning from Anonymized and Incomplete Tabular Data

User-driven privacy allows individuals to control whether and at what granularity their data is shared, leading to datasets that mix original, generalized, and missing values within the same records and attributes. While such…

Machine Learning · Computer Science 2026-02-03 Lucas Lange, Adrian Böttinger, Victor Christen, Anushka Vidanage, Peter Christen, Erhard Rahm

Assessing the risk of re-identification arising from an attack on anonymised data

Objective: The use of routinely-acquired medical data for research purposes requires the protection of patient confidentiality via data anonymisation. The objective of this work is to calculate the risk of re-identification arising from a…

Machine Learning · Computer Science 2022-04-01 Anna Antoniou, Giacomo Dossena, Julia MacMillan, Steven Hamblin, David Clifton, Paula Petrone

On the Privacy of Optimization Approaches

Ensuring privacy of sensitive data is essential in many contexts, such as healthcare data, banks, e-commerce, wireless sensor networks, and social networks. It is common that different entities coordinate or want to rely on a third party to…

Cryptography and Security · Computer Science 2014-06-16 Pradeep Chathuranga Weeraddana, George Athanasiou, Martin Jakobsson, Carlo Fischione, John S. Baras

A Revised Classification of Anonymity

This paper primarily addresses the issue of identifying all possible levels of digital anonymity, thereby allowing electronic services and mechanisms to be categorised. For this purpose, we sophisticate the generic idea of anonymity and,…

Cryptography and Security · Computer Science 2012-11-27 Peter Pleva

Smooth Anonymity for Sparse Graphs

When working with user data, providing well-defined privacy guarantees is paramount. In this work, we aim to manipulate and share an entire sparse dataset with a third party privately. In fact, differential privacy has emerged as the gold…

Cryptography and Security · Computer Science 2024-05-16 Alessandro Epasto, Hossein Esfandiari, Vahab Mirrokni, Andres Munoz Medina

Statistical anonymity: Quantifying reidentification risks without reidentifying users

Data anonymization is an approach to privacy-preserving data release aimed at preventing participant reidentification, and it is an important alternative to differential privacy in applications that cannot tolerate noisy data. Existing…

Data Structures and Algorithms · Computer Science 2022-01-31 Gecia Bravo-Hermsdorff, Robert Busa-Fekete, Lee M. Gunderson, Andrés Muñoz Medina, Umar Syed

Finding the Sweet Spot for Data Anonymization: A Mechanism Design Perspective

Data sharing between different organizations is an essential process in today's connected world. However, data sharing has recently raised many concerns, as sharing sensitive information can jeopardize users' privacy. To preserve the…

Computer Science and Game Theory · Computer Science 2021-02-01 Abdelrahman Eldosouky, Tapadhir Das, Anuraag Kotra, Shamik Sengupta

A Sensitive Attribute based Clustering Method for k-anonymization

In medical organizations, large amounts of personal data are collected and analyzed by the data miner or researcher for further perusal. However, the data collected may contain sensitive information such as the specific disease of a patient and…

Cryptography and Security · Computer Science 2012-03-19 Pawan R Bhaladhare, Devesh Jinwala