Related papers: JAMDEC: Unsupervised Authorship Obfuscation using …

Keep It Private: Unsupervised Privatization of Online Text

Authorship obfuscation techniques hold the promise of helping people protect their privacy in online communications by automatically rewriting text to hide the identity of the original author. However, obfuscation has been evaluated in…

Computation and Language · Computer Science 2024-05-17 Calvin Bao, Marine Carpuat

Personalized Author Obfuscation with Large Language Models

In this paper, we investigate the efficacy of large language models (LLMs) in obfuscating authorship by paraphrasing and altering writing styles. Rather than adopting a holistic approach that evaluates performance across the entire dataset,…

Computation and Language · Computer Science 2025-05-20 Mohammad Shokri, Sarah Ita Levitan, Rivka Levitan

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring…

Computation and Language · Computer Science 2024-08-29 Jillian Fisher, Skyler Hallinan, Ximing Lu, Mitchell Gordon, Zaid Harchaoui, Yejin Choi

AIDBench: A benchmark for evaluating the authorship identification capability of large language models

As large language models (LLMs) rapidly advance and integrate into daily life, the privacy risks they pose are attracting increasing attention. We focus on a specific privacy risk where LLMs may help identify the authorship of anonymous…

Computation and Language · Computer Science 2024-11-21 Zichen Wen, Dadi Guo, Huishuai Zhang

Masks and Mimicry: Strategic Obfuscation and Impersonation Attacks on Authorship Verification

The increasing use of Artificial Intelligence (AI) technologies, such as Large Language Models (LLMs), has led to nontrivial improvements in various tasks, including accurate authorship identification of documents. However, while LLMs…

Computation and Language · Computer Science 2025-03-26 Kenneth Alperin, Rohan Leekha, Adaku Uchendu, Trang Nguyen, Srilakshmi Medarametla, Carlos Levya Capote, Seth Aycock, Charlie Dagli

A Girl Has A Name: Detecting Authorship Obfuscation

Authorship attribution aims to identify the author of a text based on the stylometric analysis. Authorship obfuscation, on the other hand, aims to protect against authorship attribution by modifying a text's style. In this paper, we…

Computation and Language · Computer Science 2020-05-05 Asad Mahmood, Zubair Shafiq, Padmini Srinivasan

TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Authorship obfuscation aims to disguise the identity of an author within a text by altering the writing style, vocabulary, syntax, and other linguistic features associated with the text author. This alteration needs to balance privacy and…

Computation and Language · Computer Science 2025-03-19 Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi

TRAPDOC: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents

The reasoning, writing, text-editing, and retrieval capabilities of proprietary large language models (LLMs) have advanced rapidly, providing users with an ever-expanding set of functionalities. However, this growing utility has also led to…

Computers and Society · Computer Science 2025-09-30 Hyundong Jin, Sicheol Sung, Shinwoo Park, SeungYeop Baik, Yo-Sub Han

SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs

Large language models sometimes inadvertently reproduce passages that are copyrighted, exposing downstream applications to legal risk. Most existing studies for inference-time defences focus on surface-level token matching and rely on…

Computation and Language · Computer Science 2025-11-12 Zhenliang Zhang, Xinyu Hu, Xiaojun Wan

Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification

Recent advancements in large language models (LLMs) have been fueled by large scale training corpora drawn from diverse sources such as websites, news articles, and books. These datasets often contain explicit user information, such as…

Computation and Language · Computer Science 2025-05-21 Tuc Nguyen, Yifan Hu, Thai Le

Obfuscation for Privacy-preserving Syntactic Parsing

The goal of homomorphic encryption is to encrypt data such that another party can operate on it without being explicitly exposed to the content of the original data. We introduce an idea for a privacy-preserving transformation on natural…

Computation and Language · Computer Science 2020-05-28 Zhifeng Hu, Serhii Havrylov, Ivan Titov, Shay B. Cohen

Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

Modern large language model (LLM) inference engines optimize throughput and latency under fixed decoding rules, treating generation as a linear progression in token time. We propose a fundamentally different paradigm: entropic-time…

Computation and Language · Computer Science 2026-03-05 Andrew Kiruluta

JsDeObsBench: Measuring and Benchmarking LLMs for JavaScript Deobfuscation

Deobfuscating JavaScript (JS) code poses a significant challenge in web security, particularly as obfuscation techniques are frequently used to conceal malicious activities within scripts. While Large Language Models (LLMs) have recently…

Cryptography and Security · Computer Science 2025-06-26 Guoqiang Chen, Xin Jin, Zhiqiang Lin

Latent Diffusion Models for Attribute-Preserving Image Anonymization

Generative techniques for image anonymization have great potential to generate datasets that protect the privacy of those depicted in the images, while achieving high data fidelity and utility. Existing methods have focused extensively on…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Luca Piano, Pietro Basci, Fabrizio Lamberti, Lia Morra

The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations

Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. This technique has multiple uses in tasks related to…

Computation and Language · Computer Science 2021-02-25 Jeremiah Duncan, Fabian Fallas, Chris Gropp, Emily Herron, Maria Mahbub, Paula Olaya, Eduardo Ponce, Tabitha K. Samuel, Daniel Schultz, Sudarshan Srinivasan, Maofeng Tang, Viktor Zenkov, Quan Zhou, Edmon Begoli

AgentStealth: Reinforcing Large Language Model for Anonymizing User-generated Text

In today's digital world, casual user-generated content often contains subtle cues that may inadvertently expose sensitive personal attributes. Such risks underscore the growing importance of effective text anonymization to safeguard…

Computation and Language · Computer Science 2025-07-01 Chenyang Shao, Tianxing Li, Chenhao Pu, Fengli Xu, Yong Li

Prompt Obfuscation for Large Language Models

System prompts that include detailed instructions to describe the task performed by the underlying LLM can easily transform foundation models into tools and services with minimal overhead. They are often considered intellectual property,…

Cryptography and Security · Computer Science 2025-08-07 David Pape, Sina Mavali, Thorsten Eisenhofer, Lea Schönherr

InferDPT: Privacy-Preserving Inference for Closed-box Large Language Model

Large language models (LLMs), like ChatGPT, have greatly simplified text generation tasks. However, they have also raised concerns about privacy risks such as data leakage and unauthorized data collection. Existing solutions for…

Cryptography and Security · Computer Science 2026-03-19 Meng Tong, Kejiang Chen, Jie Zhang, Yuang Qi, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Zhikun Zhang

Can Large Language Models Identify Authorship?

The ability to accurately identify authorship is crucial for verifying content authenticity and mitigating misinformation. Large Language Models (LLMs) have demonstrated an exceptional capacity for reasoning and problem-solving. However,…

Computation and Language · Computer Science 2024-10-23 Baixiang Huang, Canyu Chen, Kai Shu

RL-Finetuned LLMs for Privacy-Preserving Synthetic Rewriting

The performance of modern machine learning systems depends on access to large, high-quality datasets, often sourced from user-generated content or proprietary, domain-specific corpora. However, these rich datasets inherently contain…

Cryptography and Security · Computer Science 2025-08-28 Zhan Shi, Yefeng Yuan, Yuhong Liu, Liang Cheng, Yi Fang