Related papers: An Open Source Python Library for Anonymizing Sens…

200 papers

Openly sharing data with sensitive attributes and privacy restrictions is a challenging task. In this document we present the implementation of pyCANON, a Python library and command line interface (CLI) to check and assess the level of…

Cryptography and Security · Computer Science 2023-05-15 Judith Sáinz-Pardo Díaz, Álvaro López García

Recently introduced privacy legislation has aimed to restrict and control the amount of personal data published by companies and shared with third parties. Much of this real data is not only sensitive, requiring anonymization, but also…

Databases · Computer Science 2020-07-20 Mostafa Milani, Yu Huang, Fei Chiang

In this document, we present a state of the art of anonymization techniques for classical tabular datasets. This article is geared towards a general public having some knowledge of mathematics and computer science, but with no need for…

Cryptography and Security · Computer Science 2020-01-09 Benjamin Nguyen, Claude Castelluccia

Anonymization is a foundational principle of data privacy regulation, yet its practical application remains riddled with ambiguity and inconsistency. This paper introduces the concept of anonymity-washing -- the misrepresentation of the…

Cryptography and Security · Computer Science 2025-08-27 Szilvia Lestyán, William Letrone, Ludovica Robustelli, Gergely Biczók

This work investigates the effectiveness of different pseudonymization techniques, ranging from rule-based substitutions to using pre-trained Large Language Models (LLMs), on a variety of datasets and models used for two widely used NLP…

Computation and Language · Computer Science 2023-06-12 Oleksandr Yermilov, Vipul Raheja, Artem Chernodub

In medical organizations, large amounts of personal data are collected and analyzed by data miners or researchers for further perusal. However, the data collected may contain sensitive information such as the specific disease of a patient and…

Cryptography and Security · Computer Science 2012-03-19 Pawan R Bhaladhare, Devesh Jinwala

Since its conception in 2006, differential privacy has emerged as the de-facto standard in data privacy, owing to its robust mathematical guarantees, generalised applicability and rich body of literature. Over the years, researchers have…

Cryptography and Security · Computer Science 2019-07-05 Naoise Holohan, Stefano Braghin, Pól Mac Aonghusa, Killian Levacher

Text anonymization is the process of removing or obfuscating information from textual data to protect the privacy of individuals. This process inherently involves a complex trade-off between privacy protection and information preservation,…

Computation and Language · Computer Science 2025-09-23 Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi

Publishing person-specific transactions in an anonymous form is increasingly required by organizations. Recent approaches ensure that potentially identifying information (e.g., a set of diagnosis codes) cannot be used to link published…

Databases · Computer Science 2010-01-26 Grigorios Loukides, Aris Gkoulalas-Divanis, Bradley Malin

Data protection algorithms are becoming increasingly important to support modern business needs for facilitating data sharing and data monetization. Anonymization is an important step before data sharing. Several organizations leverage on…

Cryptography and Security · Computer Science 2021-08-11 Manish Kesarwani, Akshar Kaul, Stefano Braghin, Naoise Holohan, Spiros Antonatos

Social networks have become an essential meeting point for millions of individuals willing to publish and consume huge quantities of heterogeneous information. Some studies have shown that the data published in these platforms may contain…

Cryptography and Security · Computer Science 2016-07-05 Alexandre Viejo, David Sánchez

Enormous amounts of data collected from social networks or other online platforms are being published for the sake of statistics, marketing, and research, among other objectives. The consequent privacy and data security concerns have…

Cryptography and Security · Computer Science 2021-12-24 Ola N. Halawi, Faisal N. Abu-Khzam

User-driven privacy allows individuals to control whether and at what granularity their data is shared, leading to datasets that mix original, generalized, and missing values within the same records and attributes. While such…

Machine Learning · Computer Science 2026-02-03 Lucas Lange, Adrian Böttinger, Victor Christen, Anushka Vidanage, Peter Christen, Erhard Rahm

A firm seeks to analyze a dataset and to release the results. The dataset contains information about individual people, and the firm is subject to some regulation that forbids the release of the dataset itself. The regulation also imposes…

Computers and Society · Computer Science 2024-08-28 Aloni Cohen, Micah Altman, Francesca Falzon, Evangelina Anna Markatou, Kobbi Nissim

Numerous generalization techniques have been proposed for privacy preserving data publishing. Most existing techniques, however, implicitly assume that the adversary knows little about the anonymization algorithm adopted by the data…

Databases · Computer Science 2010-03-29 Xiaokui Xiao, Yufei Tao, Nick Koudas

Mining health data can lead to faster medical decisions, improvement in the quality of treatment, disease prevention, reduced cost, and it drives innovative solutions within the healthcare sector. However, health data is highly sensitive…

Cryptography and Security · Computer Science 2022-04-28 Iyiola E. Olatunji, Jens Rauch, Matthias Katzensteiner, Megha Khosla

Process mining techniques such as process discovery and conformance checking provide insights into actual processes by analyzing event data that are widely available in information systems. These data are very valuable, but often contain…

Cryptography and Security · Computer Science 2020-09-25 Majid Rafiei, Wil M. P. van der Aalst

The recomputability and reproducibility of results from scientific software requires access to both the source code and all associated input and output data. However, the full collection of these resources often does not accompany the key…

Computational Engineering, Finance, and Science · Computer Science 2015-12-24 Christian T. Jacobs, Alexandros Avdis, Gerard J. Gorman, Matthew D. Piggott

We consider the privacy problem in data publishing: given a relation I containing sensitive information 'anonymize' it to obtain a view V such that, on one hand attackers cannot learn any sensitive information from V, and on the other hand…

Databases · Computer Science 2007-05-23 Vibhor Rastogi, Dan Suciu, Sungho Hong

The OpenData movement around the globe is demanding more access to information that lies locked in public or private servers. As recently reported by a McKinsey publication, this data has significant economic value, yet its release has…

Databases · Computer Science 2012-05-15 David Leoni