Related papers: Practical Reasoning Interruption Attacks on Reason…

Token-Efficient Prompt Injection Attack: Provoking Cessation in LLM Reasoning via Adaptive Token Compression

While reasoning large language models (LLMs) demonstrate remarkable performance across various tasks, they also contain notable security vulnerabilities. Recent research has uncovered a "thinking-stopped" vulnerability in DeepSeek-R1, where…

Cryptography and Security · Computer Science 2025-04-30 Yu Cui, Yujun Cai, Yiwei Wang

One Token Embedding Is Enough to Deadlock Your Large Reasoning Model

Modern large reasoning models (LRMs) exhibit impressive multi-step problem-solving via chain-of-thought (CoT) reasoning. However, this iterative thinking mechanism introduces a new vulnerability surface. We present the Deadlock Attack, a…

Machine Learning · Computer Science 2025-10-21 Mohan Zhang, Yihua Zhang, Jinghan Jia, Zhangyang Wang, Sijia Liu, Tianlong Chen

DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning

Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly…

Computation and Language · Computer Science 2026-01-19 Sara Vera Marjanović, Arkil Patel, Vaibhav Adlakha, Milad Aghajohari, Parishad BehnamGhader, Mehar Bhatia, Aditi Khandelwal, Austin Kraft, Benno Krojer, Xing Han Lù, Nicholas Meade, Dongchan Shin, Amirhossein Kazemnejad, Gaurav Kamath, Marius Mosbach, Karolina Stańczak, Siva Reddy

Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps

Recent reasoning large language models (LLMs) have demonstrated remarkable improvements in mathematical reasoning capabilities through long Chain-of-Thought. The reasoning tokens of these models enable self-correction within reasoning…

Artificial Intelligence · Computer Science 2025-04-02 Yu Cui, Bryan Hooi, Yujun Cai, Yiwei Wang

Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

Recent advances in large reasoning models (LRMs) have enabled remarkable performance on complex tasks such as mathematics and coding by generating long Chain-of-Thought (CoT) traces. In this paper, we identify and systematically analyze a…

Artificial Intelligence · Computer Science 2025-10-21 Zhehao Zhang, Weijie Xu, Shixian Cui, Chandan K. Reddy

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

The rapid development of large reasoning models (LRMs), such as OpenAI-o3 and DeepSeek-R1, has led to significant improvements in complex reasoning over non-reasoning large language models (LLMs). However, their enhanced capabilities,…

Computers and Society · Computer Science 2025-11-18 Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao, Shreedhar Jangam, Jayanth Srinivasa, Gaowen Liu, Dawn Song, Xin Eric Wang

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

Large Reasoning Models (LRMs) have achieved remarkable success on reasoning-intensive tasks such as mathematics and programming. However, their enhanced reasoning capabilities do not necessarily translate to improved safety performance-and…

Computation and Language · Computer Science 2026-04-21 Zhexin Zhang, Xian Qi Loye, Victor Shea-Jay Huang, Junxiao Yang, Qi Zhu, Shiyao Cui, Fei Mi, Lifeng Shang, Yingkang Wang, Hongning Wang, Minlie Huang

OverThink: Slowdown Attacks on Reasoning LLMs

Most flagship language models generate explicit reasoning chains, enabling inference-time scaling. However, producing these reasoning chains increases token usage (i.e., reasoning tokens), which in turn increases latency and costs. Our…

Machine Learning · Computer Science 2026-02-05 Abhinav Kumar, Jaechul Roh, Ali Naseh, Marzena Karpinska, Mohit Iyyer, Amir Houmansadr, Eugene Bagdasarian

System Prompt Poisoning: Persistent Attacks on Large Language Models Beyond User Injection

Large language models (LLMs) have gained widespread adoption across diverse applications due to their impressive generative capabilities. Their plug-and-play nature enables both developers and end users to interact with these models through…

Cryptography and Security · Computer Science 2025-10-21 Zongze Li, Jiawei Guo, Haipeng Cai

Chain-of-Thought Poisoning Attacks against R1-based Retrieval-Augmented Generation Systems

Retrieval-augmented generation (RAG) systems can effectively mitigate the hallucination problem of large language models (LLMs), but they also possess inherent vulnerabilities. Identifying these weaknesses before the large-scale real-world…

Information Retrieval · Computer Science 2025-05-23 Hongru Song, Yu-an Liu, Ruqing Zhang, Jiafeng Guo, Yixing Fan

RRTL: Red Teaming Reasoning Large Language Models in Tool Learning

While tool learning significantly enhances the capabilities of large language models (LLMs), it also introduces substantial security risks. Prior research has revealed various vulnerabilities in traditional LLMs during tool learning.…

Computation and Language · Computer Science 2025-05-26 Yifei Liu, Yu Cui, Haibin Zhang

Output Length Effect on DeepSeek-R1's Safety in Forced Thinking

Large Language Models (LLMs) have demonstrated strong reasoning capabilities, but their safety under adversarial conditions remains a challenge. This study examines the impact of output length on the robustness of DeepSeek-R1, particularly…

Computation and Language · Computer Science 2025-03-05 Xuying Li, Zhuo Li, Yuji Kosuga, Victor Bian

Excessive Reasoning Attack on Reasoning LLMs

Recent reasoning large language models (LLMs), such as OpenAI o1 and DeepSeek-R1, exhibit strong performance on complex tasks through test-time inference scaling. However, prior studies have shown that these models often incur significant…

Cryptography and Security · Computer Science 2025-06-18 Wai Man Si, Mingjie Li, Michael Backes, Yang Zhang

SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities

Emerging large reasoning models (LRMs), such as DeepSeek-R1 models, leverage long chain-of-thought (CoT) reasoning to generate structured intermediate steps, enhancing their reasoning capabilities. However, long CoT does not inherently…

Artificial Intelligence · Computer Science 2025-02-18 Fengqing Jiang, Zhangchen Xu, Yuetai Li, Luyao Niu, Zhen Xiang, Bo Li, Bill Yuchen Lin, Radha Poovendran

Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing

Large Reasoning Models (LRMs) have demonstrated strong capabilities in generating step-by-step reasoning chains alongside final answers, enabling their deployment in high-stakes domains such as healthcare and education. While prior…

Machine Learning · Computer Science 2026-04-20 Zehao Wang, Lanjun Wang

ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models

Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces, but this capability introduces a new class of prompt-induced inference-time denial-of-service (PI-DoS) attacks that exploit the high…

Cryptography and Security · Computer Science 2026-02-03 Xiaogeng Liu, Xinyan Wang, Yechao Zhang, Sanjay Kariyappa, Chong Xiang, Muhao Chen, G. Edward Suh, Chaowei Xiao

Preemptive Answer "Attacks" on Chain-of-Thought Reasoning

Large language models (LLMs) showcase impressive reasoning capabilities when coupled with Chain-of-Thought (CoT) prompting. However, the robustness of this approach warrants further investigation. In this paper, we introduce a novel…

Computation and Language · Computer Science 2024-06-03 Rongwu Xu, Zehan Qi, Wei Xu

Stepwise Reasoning Error Disruption Attack of LLMs

Large language models (LLMs) have made remarkable strides in complex reasoning tasks, but their safety and robustness in reasoning processes remain underexplored. Existing attacks on LLM reasoning are constrained by specific settings or…

Artificial Intelligence · Computer Science 2025-06-17 Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu

When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs

Reasoning-enhanced large language models (RLLMs), whether explicitly trained for reasoning or prompted via chain-of-thought (CoT), have achieved state-of-the-art performance on many complex reasoning tasks. However, we uncover a surprising…

Computation and Language · Computer Science 2025-09-03 Xiaomin Li, Zhou Yu, Zhiwei Zhang, Xupeng Chen, Ziji Zhang, Yingying Zhuang, Narayanan Sadagopan, Anurag Beniwal

RAPO: Risk-Aware Preference Optimization for Generalizable Safe Reasoning

Large Reasoning Models (LRMs) have achieved tremendous success with their chain-of-thought (CoT) reasoning, yet also face safety issues similar to those of basic language models. In particular, while algorithms are designed to guide them to…

Machine Learning · Computer Science 2026-02-05 Zeming Wei, Qiaosheng Zhang, Xia Hu, Xingcheng Xu