Related papers: Source Attribution for Large Language Model-Genera…

Watermarking Large Language Models and the Generated Content: Opportunities and Challenges

The widely adopted and powerful generative large language models (LLMs) have raised concerns about intellectual property rights violations and the spread of machine-generated misinformation. Watermarking serves as a promising approach to…

Cryptography and Security · Computer Science 2024-10-28 Ruisi Zhang, Farinaz Koushanfar

Waterfall: Framework for Robust and Scalable Text Watermarking and Provenance for LLMs

Protecting intellectual property (IP) of text such as articles and code is increasingly important, especially as sophisticated attacks become possible, such as paraphrasing by large language models (LLMs) or even unauthorized training of…

Cryptography and Security · Computer Science 2024-10-30 Gregory Kang Ruey Lau, Xinyuan Niu, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low

Turning Your Strength into Watermark: Watermarking Large Language Model via Knowledge Injection

Large language models (LLMs) have demonstrated outstanding performance, making them valuable digital assets with significant commercial potential. Unfortunately, the LLM and its API are susceptible to intellectual property theft.…

Cryptography and Security · Computer Science 2024-07-25 Shuai Li, Kejiang Chen, Kunsheng Tang, Jie Zhang, Weiming Zhang, Nenghai Yu, Kai Zeng

Watermarking LLM-Generated Datasets in Downstream Tasks

Large Language Models (LLMs) have experienced rapid advancements, with applications spanning a wide range of fields, including sentiment classification, review generation, and question answering. Due to their efficiency and versatility,…

Cryptography and Security · Computer Science 2025-06-17 Yugeng Liu, Tianshuo Cong, Michael Backes, Zheng Li, Yang Zhang

DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for Identifying Large Language Model Generated Text

The rapid advancement of Large Language Models (LLMs) has significantly enhanced the capabilities of text generators. With the potential for misuse escalating, the importance of discerning whether texts are human-authored or generated by…

Multimedia · Computer Science 2024-03-12 Travis Munyer, Abdullah Tanvir, Arjon Das, Xin Zhong

Segmenting Watermarked Texts From Language Models

Watermarking is a technique that involves embedding nearly unnoticeable statistical signals within generated content to help trace its source. This work focuses on a scenario where an untrusted third-party user sends prompts to a trusted…

Machine Learning · Computer Science 2024-10-29 Xingchi Li, Guanxun Li, Xianyang Zhang

Topic-Based Watermarks for Large Language Models

The indistinguishability of large language model (LLM) output from human-authored content poses significant challenges, raising concerns about potential misuse of AI-generated text and its influence on future model training. Watermarking…

Cryptography and Security · Computer Science 2026-04-16 Alexander Nemecek, Yuzhou Jiang, Erman Ayday

Attributable-Watermarking of Speech Generative Models

Generative models are now capable of synthesizing images, speech, and videos that are hardly distinguishable from authentic content. Such capabilities cause concerns such as malicious impersonation and IP theft. This paper investigates a…

Sound · Computer Science 2022-03-16 Yongbaek Cho, Changhoon Kim, Yezhou Yang, Yi Ren

REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

We present REMARK-LLM, a novel, efficient, and robust watermarking framework designed for texts generated by large language models (LLMs). Synthesizing human-like content using LLMs necessitates vast computational resources and extensive…

Cryptography and Security · Computer Science 2024-04-09 Ruisi Zhang, Shehzeen Samarah Hussain, Paarth Neekhara, Farinaz Koushanfar

Building Intelligence Identification System via Large Language Model Watermarking: A Survey and Beyond

Large Language Models (LLMs) are increasingly integrated into diverse industries, posing substantial security risks due to unauthorized replication and misuse. To mitigate these concerns, robust identification mechanisms are widely…

Cryptography and Security · Computer Science 2024-07-25 Xuhong Wang, Haoyu Jiang, Yi Yu, Jingru Yu, Yilun Lin, Ping Yi, Yingchun Wang, Yu Qiao, Li Li, Fei-Yue Wang

From Text to Source: Results in Detecting Large Language Model-Generated Content

The widespread use of Large Language Models (LLMs), celebrated for their ability to generate human-like text, has raised concerns about misinformation and ethical implications. Addressing these concerns necessitates the development of…

Computation and Language · Computer Science 2024-03-28 Wissam Antoun, Benoît Sagot, Djamé Seddah

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?

Large Language Models (LLMs) have demonstrated impressive capabilities in generating diverse and contextually rich text. However, concerns regarding copyright infringement arise as LLMs may inadvertently produce copyrighted material. In…

Machine Learning · Computer Science 2025-06-06 Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang

Provably Robust Multi-bit Watermarking for AI-generated Text

Large Language Models (LLMs) have demonstrated remarkable capabilities of generating texts resembling human language. However, they can be misused by criminals to create deceptive content, such as fake news and phishing emails, which raises…

Cryptography and Security · Computer Science 2025-01-29 Wenjie Qu, Wengrui Zheng, Tianyang Tao, Dong Yin, Yanze Jiang, Zhihua Tian, Wei Zou, Jinyuan Jia, Jiaheng Zhang

CLASP: Training-Free LLM-Assisted Source Code Watermarking via Semantic-Preserving Transformations

The proliferation of open-source code and large language models (LLMs) for code generation has amplified the risks of unauthorized reuse and intellectual property infringement. Source code watermarking offers a potential solution, yet…

Cryptography and Security · Computer Science 2026-04-21 Rui Xu, Jiawei Chen, Weizhi Liu, Zhaoxia Yin, Cong Kong, Xinpeng Zhang

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content

This paper presents a survey and taxonomy of LLM fingerprinting and watermarking for identity, ownership verification, provenance, and generated-content attribution. Large language models (LLMs) require substantial investments in data,…

Cryptography and Security · Computer Science 2026-05-29 Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu, Jing Huang, Linkang Du, Hongbin Pei, Wei Luo

DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models

As large language models (LLMs) grow more powerful, concerns over copyright infringement of LLM-generated texts have intensified. LLM watermarking has been proposed to trace unauthorized redistribution or resale of generated content by…

Cryptography and Security · Computer Science 2025-08-05 Qihao Lin, Chen Tang, Lan Zhang, Junyang Zhang, Xiangyang Li

Watermarking Techniques for Large Language Models: A Survey

With the rapid advancement and extensive application of artificial intelligence technology, large language models (LLMs) are extensively used to enhance production, creativity, learning, and work efficiency across various domains. However,…

Cryptography and Security · Computer Science 2024-09-04 Yuqing Liang, Jiancheng Xiao, Wensheng Gan, Philip S. Yu

LLM Watermarking Using Mixtures and Statistical-to-Computational Gaps

Given a text, can we determine whether it was generated by a large language model (LLM) or by a human? A widely studied approach to this problem is watermarking. We propose an undetectable and elementary watermarking scheme in the closed…

Cryptography and Security · Computer Science 2025-06-26 Pedro Abdalla, Roman Vershynin

Adaptive Text Watermark for Large Language Models

The advancement of Large Language Models (LLMs) has led to increasing concerns about the misuse of AI-generated text, and watermarking for LLM-generated text has emerged as a potential solution. However, it is challenging to generate…

Computation and Language · Computer Science 2024-06-11 Yepeng Liu, Yuheng Bu

Temperature Matters: Enhancing Watermark Robustness Against Paraphrasing Attacks

In the present-day scenario, Large Language Models (LLMs) are establishing their presence as powerful instruments permeating various sectors of society. While their utility offers valuable support to individuals, there are multiple concerns…

Computation and Language · Computer Science 2025-07-01 Badr Youbi Idrissi, Monica Millunzi, Amelia Sorrenti, Lorenzo Baraldi, Daryna Dementieva