Related papers: Knowledge Return Oriented Prompting (KROP)

Prompt-in-Content Attacks: Exploiting Uploaded Inputs to Hijack LLM Behavior

Large Language Models (LLMs) are widely deployed in applications that accept user-submitted content, such as uploaded documents or pasted text, for tasks like summarization and question answering. In this paper, we identify a new class of…

Cryptography and Security · Computer Science 2025-08-28 Zhuotao Lian, Weiyu Wang, Qingkui Zeng, Toru Nakanishi, Teruaki Kitasuka, Chunhua Su

Misleading Large Language Models used (or misused) in Scientific Peer-Reviewing via Hidden Prompt-Injection Attacks

Large Language Models (LLMs) are increasingly being integrated into the scientific peer-review process, raising new questions about their reliability and resilience to manipulation. In this work, we investigate the potential for hidden…

Cryptography and Security · Computer Science 2026-03-31 Matteo Gioele Collu, Umberto Salviati, Roberto Confalonieri, Mauro Conti, Giovanni Apruzzese

Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks

Large Language Models (LLMs) have demonstrated remarkable performance across various tasks by effectively utilizing a prompting strategy. However, they are highly sensitive to input perturbations, such as typographical errors or slight…

Computation and Language · Computer Science 2026-05-27 Lin Mu, Guowei Chu, Li Ni, Lei Sang, Yiwen Zhang

Defending Against Indirect Prompt Injection Attacks With Spotlighting

Large Language Models (LLMs), while powerful, are built and trained to process a single text input. In common applications, multiple inputs can be processed by concatenating them together into a single stream of text. However, the LLM is…

Cryptography and Security · Computer Science 2024-03-25 Keegan Hines, Gary Lopez, Matthew Hall, Federico Zarfati, Yonatan Zunger, Emre Kiciman

How Not to Detect Prompt Injections with an LLM

LLM-integrated applications and agents are vulnerable to prompt injection attacks, where adversaries embed malicious instructions within seemingly benign input data to manipulate the LLM's intended behavior. Recent defenses based on…

Cryptography and Security · Computer Science 2025-12-09 Sarthak Choudhary, Divyam Anshumaan, Nils Palumbo, Somesh Jha

The never ending war in the stack and the reincarnation of ROP attacks

Return Oriented Programming (ROP) is a technique by which an attacker can induce arbitrary behavior inside a vulnerable program without injecting malicious code. The continued failure of the currently deployed defenses against ROP has…

Cryptography and Security · Computer Science 2020-05-26 Ammari Nader, Joan Calvet, Jose M. Fernandez

CAP: Controllable Alignment Prompting for Unlearning in LLMs

Large language models (LLMs) trained on unfiltered corpora inherently risk retaining sensitive information, necessitating selective knowledge unlearning for regulatory compliance and ethical safety. However, existing parameter-modifying…

Machine Learning · Computer Science 2026-05-18 Zhaokun Wang, Jinyu Guo, Jingwen Pu, Hongli Pu, Meng Yang, Xunlei Chen, Jie Ou, Wenyi Li, Guangchun Luo, Wenhong Tian

Enhancing Adversarial Resistance in LLMs with Recursion

The increasing integration of Large Language Models (LLMs) into society necessitates robust defenses against vulnerabilities from jailbreaking and adversarial prompts. This project proposes a recursive framework for enhancing the resistance…

Cryptography and Security · Computer Science 2024-12-10 Bryan Li, Sounak Bagchi, Zizhan Wang

Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection

Large language models (LLMs) are becoming a popular tool as they have significantly advanced in their capability to tackle a wide range of language-based tasks. However, LLM applications are highly vulnerable to prompt injection attacks,…

Computation and Language · Computer Science 2024-11-11 Md Abdur Rahman, Fan Wu, Alfredo Cuzzocrea, Sheikh Iqbal Ahamed

Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

Large Language Models (LLMs) are increasingly being integrated into various applications. The functionalities of recent LLMs can be flexibly modulated via natural language prompts. This renders them susceptible to targeted adversarial…

Cryptography and Security · Computer Science 2023-05-08 Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz

Defense Against Prompt Injection Attack by Leveraging Attack Techniques

With the advancement of technology, large language models (LLMs) have achieved remarkable performance across various natural language processing (NLP) tasks, powering LLM-integrated applications like Microsoft Copilot. However, as LLMs…

Cryptography and Security · Computer Science 2025-08-05 Yulin Chen, Haoran Li, Zihao Zheng, Yangqiu Song, Dekai Wu, Bryan Hooi

Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions

Security threats like prompt injection attacks pose significant risks to applications that integrate Large Language Models (LLMs), potentially leading to unauthorized actions such as API misuse. Unlike previous approaches that aim to detect…

Cryptography and Security · Computer Science 2025-04-01 Shih-Han Chan

Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition

Large Language Models (LLMs) are deployed in interactive contexts with direct user engagement, such as chatbots and writing assistants. These deployments are vulnerable to prompt injection and jailbreaking (collectively, prompt hacking), in…

Cryptography and Security · Computer Science 2024-03-05 Sander Schulhoff, Jeremy Pinto, Anaum Khan, Louis-François Bouchard, Chenglei Si, Svetlina Anati, Valen Tagliabue, Anson Liu Kost, Christopher Carnahan, Jordan Boyd-Graber

Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models

Large Language Models (LLMs) are increasingly equipped with capabilities of real-time web search and integrated with protocols like Model Context Protocol (MCP). This extension could introduce new security vulnerabilities. We present a…

Cryptography and Security · Computer Science 2025-05-23 Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li

Prompt Obfuscation for Large Language Models

System prompts that include detailed instructions to describe the task performed by the underlying LLM can easily transform foundation models into tools and services with minimal overhead. They are often considered intellectual property,…

Cryptography and Security · Computer Science 2025-08-07 David Pape, Sina Mavali, Thorsten Eisenhofer, Lea Schönherr

System Prompt Poisoning: Persistent Attacks on Large Language Models Beyond User Injection

Large language models (LLMs) have gained widespread adoption across diverse applications due to their impressive generative capabilities. Their plug-and-play nature enables both developers and end users to interact with these models through…

Cryptography and Security · Computer Science 2025-10-21 Zongze Li, Jiawei Guo, Haipeng Cai

Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications

The critical challenge of prompt injection attacks in Large Language Models (LLMs) integrated applications, a growing concern in the Artificial Intelligence (AI) field. Such attacks, which manipulate LLMs through natural language inputs,…

Cryptography and Security · Computer Science 2024-01-17 Xuchen Suo

PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models

Prompts have significantly improved the performance of pretrained Large Language Models (LLMs) on various downstream tasks recently, making them increasingly indispensable for a diverse range of LLM application scenarios. However, the…

Computation and Language · Computer Science 2023-12-19 Hongwei Yao, Jian Lou, Zhan Qin

Prompt Fencing: A Cryptographic Approach to Establishing Security Boundaries in Large Language Model Prompts

Large Language Models (LLMs) remain vulnerable to prompt injection attacks, representing the most significant security threat in production deployments. We present Prompt Fencing, a novel architectural approach that applies cryptographic…

Cryptography and Security · Computer Science 2025-11-26 Steven Peh

Hijacking Large Language Models via Adversarial In-Context Learning

In-context learning (ICL) has emerged as a powerful paradigm leveraging LLMs for specific downstream tasks by utilizing labeled examples as demonstrations (demos) in the preconditioned prompts. Despite its promising performance, crafted…

Machine Learning · Computer Science 2025-05-30 Xiangyu Zhou, Yao Qiang, Saleh Zare Zade, Prashant Khanduri, Dongxiao Zhu