Related papers: A Sensitive Attribute based Clustering Method for …

A Review of Anonymization for Healthcare Data

Mining health data can lead to faster medical decisions, improvement in the quality of treatment, disease prevention, reduced cost, and it drives innovative solutions within the healthcare sector. However, health data is highly sensitive…

Cryptography and Security · Computer Science 2022-04-28 Iyiola E. Olatunji, Jens Rauch, Matthias Katzensteiner, Megha Khosla

Clustering based Privacy Preserving of Big Data using Fuzzification and Anonymization Operation

Big Data is used by data miners for analysis and may contain sensitive information. Processing it raises certain privacy challenges for researchers. The existing privacy preserving methods use different algorithms that…

Databases · Computer Science 2020-01-07 Saira Khan, Khalid Iqbal, Safi Faizullah, Muhammad Fahad, Jawad Ali, Waqas Ahmed

A Multi-level Clustering Approach for Anonymizing Large-Scale Physical Activity Data

Publishing physical activity data can facilitate reproducible health-care research in several areas such as population health management, behavioral health research, and management of chronic health problems. However, publishing such data…

Cryptography and Security · Computer Science 2019-08-22 Pooja Parameshwarappa, Zhiyuan Chen, Gunes Koru

Privacy Gain Based Multi-Iterative k-Anonymization to Protect Respondents Privacy

Huge volumes of data from domain-specific applications, such as medical, financial, telephone, and shopping records, and from individuals, are regularly generated. Sharing these data has proved beneficial for data mining applications. Since data…

Methodology · Statistics 2014-03-21 Hitesh Chhinkaniwala, Sanjay Garg

Diversifying Anonymized Data with Diversity Constraints

Recently introduced privacy legislation has aimed to restrict and control the amount of personal data published by companies and shared to third parties. Much of this real data is not only sensitive requiring anonymization, but also…

Databases · Computer Science 2020-07-20 Mostafa Milani, Yu Huang, Fei Chiang

Finding the Sweet Spot for Data Anonymization: A Mechanism Design Perspective

Data sharing between different organizations is an essential process in today's connected world. However, recently there were many concerns about data sharing as sharing sensitive information can jeopardize users' privacy. To preserve the…

Computer Science and Game Theory · Computer Science 2021-02-01 Abdelrahman Eldosouky, Tapadhir Das, Anuraag Kotra, Shamik Sengupta

Multi-Objective Optimization-Based Anonymization of Structured Data for Machine Learning Application

Organizations are collecting vast amounts of data, but they often lack the capabilities needed to fully extract insights. As a result, they increasingly share data with external experts, such as analysts or researchers, to gain value from…

Machine Learning · Computer Science 2025-05-16 Yusi Wei, Hande Y. Benson, Joseph K. Agor, Muge Capan

Hybrid Microaggregation for Privacy-Preserving Data Mining

k-Anonymity by microaggregation is one of the most commonly used anonymization techniques. This success is owed to the worthwhile tradeoff it achieves between information loss and identity disclosure risk. However, this method…

Cryptography and Security · Computer Science 2018-12-06 Balkis Abidi, Sadok Ben Yahia, Charith Perera

An Effective Clustering Approach to Web Query Log Anonymization

Web query log data contain information useful to research; however, release of such data can re-identify the search engine users issuing the queries. These privacy concerns go far beyond removing explicitly identifying information such as…

Databases · Computer Science 2010-12-06 Amin Milani Fard, Ke Wang

Cloaked Classifiers: Pseudonymization Strategies on Sensitive Classification Tasks

Protecting privacy is essential when sharing data, particularly in the case of an online radicalization dataset that may contain personal information. In this paper, we explore the balance between preserving data usefulness and ensuring…

Computation and Language · Computer Science 2024-06-27 Arij Riabi, Menel Mahamdi, Virginie Mouilleron, Djamé Seddah

Anonymization with Worst-Case Distribution-Based Background Knowledge

Background knowledge is an important factor in privacy preserving data publishing. Distribution-based background knowledge is one of the well-studied kinds of background knowledge. However, to the best of our knowledge, there is no existing work…

Databases · Computer Science 2009-09-08 Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang, Yabo Xu, Jian Pei, Philip S. Yu

Can the Utility of Anonymized Data be used for Privacy Breaches?

Group based anonymization is the most widely studied approach for privacy preserving data publishing. This includes k-anonymity, l-diversity, and t-closeness, to name a few. The goal of this paper is to raise a fundamental issue on the…

Databases · Computer Science 2009-05-13 Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang, Yabo Xu, Philip S. Yu

An Abstract View on the De-anonymization Process

Over the recent years, the availability of datasets containing personal, but anonymized information has been continuously increasing. Extensive research has revealed that such datasets are vulnerable to privacy breaches: being able to…

Cryptography and Security · Computer Science 2019-02-27 Alexandros Bampoulidis, Mihai Lupu

Tuple Value Based Multiplicative Data Perturbation Approach To Preserve Privacy In Data Stream Mining

Huge volumes of data from domain-specific applications, such as medical, financial, library, telephone, and shopping records, and from individuals, are regularly generated. Sharing these data has proved beneficial for data mining applications. On…

Databases · Computer Science 2013-06-07 Hitesh Chhinkaniwala, Sanjay Garg

Which anonymization technique is best for which NLP task? -- It depends. A Systematic Study on Clinical Text Processing

Clinical text processing has gained more and more attention in recent years. The access to sensitive patient data, on the other hand, is still a big challenge, as text cannot be shared without legal hurdles and without removing personal…

Computation and Language · Computer Science 2022-09-02 Iyadh Ben Cheikh Larbi, Aljoscha Burchardt, Roland Roller

Group-Based Privacy Preservation Techniques for Process Mining

Process mining techniques help to improve processes using event data. Such data are widely available in information systems. However, they often contain highly sensitive information. For example, healthcare information systems record event…

Databases · Computer Science 2021-05-26 Majid Rafiei, Wil M. P. van der Aalst

Distribution-Preserving k-Anonymity

Preserving the privacy of individuals by protecting their sensitive attributes is an important consideration during microdata release. However, it is equally important to preserve the quality or utility of the data for at least some…

Machine Learning · Statistics 2017-11-07 Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

A Multi-Objective Degree-Based Network Anonymization Approach

Enormous amounts of data collected from social networks or other online platforms are being published for the sake of statistics, marketing, and research, among other objectives. The consequent privacy and data security concerns have…

Cryptography and Security · Computer Science 2021-12-24 Ola N. Halawi, Faisal N. Abu-Khzam

On the Evaluation of the Privacy Breach in Disassociated Set-Valued Datasets

Data anonymization is gaining much attention these days as it provides the fundamental requirements to safely outsource datasets containing identifying information. While some techniques add noise to protect privacy, others use…

Cryptography and Security · Computer Science 2016-11-28 Sara Barakat, Bechara Al Bouna, Mohamed Nassar, Christophe Guyeux

Contrained Generalization For Data Anonymization - A Systematic Search Based Approach

Data generalization is a powerful technique for sanitizing multi-attribute data for publication. In a multidimensional model, a subset of attributes called the quasi-identifiers (QI) are used to define the space and a generalization scheme…

Databases · Computer Science 2021-08-12 Bijit Hore, Ravi Jammalamadaka, Sharad Mehrotra, Amedeo D'Ascanio