Related papers: TAROT: Task-Oriented Authorship Obfuscation Using …

Keep It Private: Unsupervised Privatization of Online Text

Authorship obfuscation techniques hold the promise of helping people protect their privacy in online communications by automatically rewriting text to hide the identity of the original author. However, obfuscation has been evaluated in…

Computation and Language · Computer Science 2024-05-17 Calvin Bao, Marine Carpuat

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing…

Computation and Language · Computer Science 2026-04-21 Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi

A Girl Has A Name: Detecting Authorship Obfuscation

Authorship attribution aims to identify the author of a text based on the stylometric analysis. Authorship obfuscation, on the other hand, aims to protect against authorship attribution by modifying a text's style. In this paper, we…

Computation and Language · Computer Science 2020-05-05 Asad Mahmood, Zubair Shafiq, Padmini Srinivasan

Author Obfuscation Using Generalised Differential Privacy

The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad-hoc and rely on assumptions about an adversary's auxiliary knowledge which makes it difficult to…

Cryptography and Security · Computer Science 2018-05-24 Natasha Fernandes, Mark Dras, Annabelle McIver

The Privacy-Utility Tradeoff in Rank-Preserving Dataset Obfuscation

Dataset obfuscation refers to techniques in which random noise is added to the entries of a given dataset, prior to its public release, to protect against leakage of private information. In this work, dataset obfuscation under two…

Information Theory · Computer Science 2023-05-15 Mahshad Shariatnasab, Farhad Shirani, S. Sitharma Iyengar

UID as a Guiding Metric for Automated Authorship Obfuscation

Protecting the anonymity of authors has become a difficult task given the rise of automated authorship attributors. These attributors are capable of attributing the author of a text amongst a pool of authors with great accuracy. In order to…

Computation and Language · Computer Science 2023-12-08 Nicholas Abegg

ALISON: Fast and Effective Stylometric Authorship Obfuscation

Authorship Attribution (AA) and Authorship Obfuscation (AO) are two competing tasks of increasing importance in privacy research. Modern AA leverages an author's consistent writing style to match a text to its author using an AA classifier.…

Computation and Language · Computer Science 2024-02-02 Eric Xing, Saranya Venkatraman, Thai Le, Dongwon Lee

A Girl Has A Name, And It's ... Adversarial Authorship Attribution for Deobfuscation

Recent advances in natural language processing have enabled powerful privacy-invasive authorship attribution. To counter authorship attribution, researchers have proposed a variety of rule-based and learning-based text obfuscation…

Computation and Language · Computer Science 2022-03-23 Wanyue Zhai, Jonathan Rusert, Zubair Shafiq, Padmini Srinivasan

Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

Two interlocking research questions of growing interest and importance in privacy research are Authorship Attribution (AA) and Authorship Obfuscation (AO). Given an artifact, especially a text t in question, an AA solution aims to…

Computation and Language · Computer Science 2023-03-14 Adaku Uchendu, Thai Le, Dongwon Lee

Privacy Games: Optimal User-Centric Data Obfuscation

In this paper, we design user-centric obfuscation mechanisms that impose the minimum utility loss for guaranteeing user's privacy. We optimize utility subject to a joint guarantee of differential privacy (indistinguishability) and…

Cryptography and Security · Computer Science 2015-05-29 Reza Shokri

Trade-offs and Guarantees of Adversarial Representation Learning for Information Obfuscation

Crowdsourced data used in machine learning services might carry sensitive information about attributes that users do not want to share. Various methods have been proposed to minimize the potential information leakage of sensitive attributes…

Machine Learning · Computer Science 2020-10-27 Han Zhao, Jianfeng Chi, Yuan Tian, Geoffrey J. Gordon

Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness

Text style can reveal sensitive attributes of the author (e.g. race or age) to the reader, which can, in turn, lead to privacy violations and bias in both human and algorithmic decisions based on text. For example, the style of writing in…

Machine Learning · Computer Science 2021-09-13 Fatemehsadat Mireshghallah, Taylor Berg-Kirkpatrick

Obfuscation for Privacy-preserving Syntactic Parsing

The goal of homomorphic encryption is to encrypt data such that another party can operate on it without being explicitly exposed to the content of the original data. We introduce an idea for a privacy-preserving transformation on natural…

Computation and Language · Computer Science 2020-05-28 Zhifeng Hu, Serhii Havrylov, Ivan Titov, Shay B. Cohen

Improving Suppression to Reduce Disclosure Risk and Enhance Data Utility

In Privacy Preserving Data Publishing, various privacy models have been developed for employing anonymization operations on sensitive individual level datasets, in order to publish the data for public access while preserving the privacy of…

Databases · Computer Science 2019-01-09 Marmar Orooji, Gerald M. Knapp

Protecting Anonymous Speech: A Generative Adversarial Network Methodology for Removing Stylistic Indicators in Text

With Internet users constantly leaving a trail of text, whether through blogs, emails, or social media posts, the ability to write and protest anonymously is being eroded because artificial intelligence, when given a sample of previous…

Machine Learning · Computer Science 2021-10-19 Rishi Balakrishnan, Stephen Sloan, Anil Aswani

The Case for Being Average: A Mediocrity Approach to Style Masking and Author Obfuscation

Users posting online expect to remain anonymous unless they have logged in, which is often needed for them to be able to discuss freely on various topics. Preserving the anonymity of a text's writer can be also important in some other…

Computation and Language · Computer Science 2017-07-31 Georgi Karadjov, Tsvetomila Mihaylova, Yasen Kiprov, Georgi Georgiev, Ivan Koychev, Preslav Nakov

Masks and Mimicry: Strategic Obfuscation and Impersonation Attacks on Authorship Verification

The increasing use of Artificial Intelligence (AI) technologies, such as Large Language Models (LLMs) has led to nontrivial improvements in various tasks, including accurate authorship identification of documents. However, while LLMs…

Computation and Language · Computer Science 2025-03-26 Kenneth Alperin, Rohan Leekha, Adaku Uchendu, Trang Nguyen, Srilakshmi Medarametla, Carlos Levya Capote, Seth Aycock, Charlie Dagli

Personalized Author Obfuscation with Large Language Models

In this paper, we investigate the efficacy of large language models (LLMs) in obfuscating authorship by paraphrasing and altering writing styles. Rather than adopting a holistic approach that evaluates performance across the entire dataset,…

Computation and Language · Computer Science 2025-05-20 Mohammad Shokri, Sarah Ita Levitan, Rivka Levitan

Utility/Privacy Trade-off through the lens of Optimal Transport

Strategic information is valuable either by remaining private (for instance if it is sensitive) or, on the other hand, by being used publicly to increase some utility. These two objectives are antagonistic and leaking this information might…

Machine Learning · Statistics 2020-03-03 Etienne Boursier, Vianney Perchet

Optimizing Privacy and Utility Tradeoffs for Group Interests Through Harmonization

We propose a novel problem formulation to address the privacy-utility tradeoff, specifically when dealing with two distinct user groups characterized by unique sets of private and utility attributes. Unlike previous studies that primarily…

Machine Learning · Computer Science 2024-09-12 Bishwas Mandal, George Amariucai, Shuangqing Wei