Related papers: RedCodeAgent: Automatic Red-teaming Agent against …

200 papers

As large language models (LLMs) are increasingly used for code generation, concerns over the security risks have grown substantially. Early research has primarily focused on red teaming, which aims to uncover and evaluate vulnerabilities…

Software Engineering · Computer Science 2025-10-22 Chengquan Guo, Yuzhou Nie, Chulin Xie, Zinan Lin, Wenbo Guo, Bo Li

Recently, advanced Large Language Models (LLMs) such as GPT-4 have been integrated into many real-world applications like Code Copilot. These applications have significantly expanded the attack surface of LLMs, exposing them to a variety of…

Cryptography and Security · Computer Science 2024-07-24 Huiyu Xu, Wenhui Zhang, Zhibo Wang, Feng Xiao, Rui Zheng, Yunhe Feng, Zhongjie Ba, Kui Ren

Large Language Models (LLMs) for code generation (i.e., Code LLMs) have demonstrated impressive capabilities in AI-assisted software development and testing. However, recent studies have shown that these models are prone to generating…

Software Engineering · Computer Science 2025-07-31 Wenjie Jacky Mo, Qin Liu, Xiaofei Wen, Dongwon Jung, Hadi Askari, Wenxuan Zhou, Zhe Zhao, Muhao Chen

With the rapidly increasing capabilities and adoption of code agents for AI-assisted coding, safety concerns, such as generating or executing risky code, have become significant barriers to the real-world deployment of these agents. To…

Software Engineering · Computer Science 2024-11-13 Chengquan Guo, Xun Liu, Chulin Xie, Andy Zhou, Yi Zeng, Zinan Lin, Dawn Song, Bo Li

As large language models (LLMs) become increasingly capable, security and safety evaluation are crucial. While current red teaming approaches have made strides in assessing LLM vulnerabilities, they often rely heavily on human input and…

Cryptography and Security · Computer Science 2025-03-21 Andy Zhou, Kevin Wu, Francesco Pinto, Zhaorun Chen, Yi Zeng, Yu Yang, Shuang Yang, Sanmi Koyejo, James Zou, Bo Li

AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due to their high capability and flexibility, such agents raise significant security and safety…

Foundation model-based agents are increasingly used to automate complex tasks, enhancing efficiency and productivity. However, their access to sensitive resources and autonomous decision-making also introduce significant security risks,…

Cryptography and Security · Computer Science 2025-06-03 Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li

Large language models (LLMs) have shown promise in assisting cybersecurity tasks, yet existing approaches struggle with automatic vulnerability discovery and exploitation due to limited interaction, weak execution grounding, and a lack of…

Recent studies have discovered that large language models (LLMs) may be "fooled" into outputting private information, including training data, system prompts, and personally identifiable information, under carefully crafted adversarial prompts.…

Cryptography and Security · Computer Science 2025-08-11 Yuzhou Nie, Zhun Wang, Ye Yu, Xian Wu, Xuandong Zhao, Wenbo Guo, Dawn Song

Coding agents powered by large language models are becoming central modules of modern IDEs, helping users perform complex tasks by invoking tools. While powerful, tool invocation opens a substantial attack surface. Prior work has…

Cryptography and Security · Computer Science 2026-01-06 Yuchong Xie, Mingyu Luo, Zesen Liu, Zhixiang Zhang, Kaikai Zhang, Yu Liu, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She

Cybersecurity threats are becoming increasingly sophisticated, making traditional defense mechanisms and manual red teaming approaches insufficient for modern organizations. While red teaming has long been recognized as an effective method…

Cryptography and Security · Computer Science 2026-02-26 Shruti Srivastava, Kiranmayee Janardhan, Shaurya Jauhari

We introduce RedDebate, a novel multi-agent debate framework that provides the foundation for Large Language Models (LLMs) to identify and mitigate their unsafe behaviours. Existing AI safety approaches often rely on costly human evaluation…

Computation and Language · Computer Science 2025-10-13 Ali Asad, Stephen Obadinma, Radin Shayanfar, Xiaodan Zhu

In this paper, we introduce ResearchCodeAgent, a novel multi-agent system leveraging large language model (LLM) agents to automate the codification of research methodologies described in machine learning literature. The system bridges the…

Software Engineering · Computer Science 2025-05-06 Shubham Gandhi, Dhruv Shah, Manasi Patwardhan, Lovekesh Vig, Gautam Shroff

Computer-use agents (CUAs) promise to automate complex tasks across operating systems (OS) and the web, but remain vulnerable to indirect prompt injection. Current evaluations of this threat either lack support for realistic but controlled…

Computation and Language · Computer Science 2026-03-03 Zeyi Liao, Jaylen Jones, Linxi Jiang, Yuting Ning, Eric Fosler-Lussier, Yu Su, Zhiqiang Lin, Huan Sun

We introduce a red-teaming methodology that exposes harder-to-catch attacks for coding-agent monitors, suggesting that current practices may under-elicit attacks and overstate monitor performance. We identify three challenges with current…

Cryptography and Security · Computer Science 2026-05-12 Monika Jotautaitė, Maria Angelica Martinez, Ollie Matthews, Tyler Tracy

From automated intrusion testing to the discovery of zero-day attacks before software launch, agentic AI holds great promise for security engineering. This strong capability is bound up with a similar threat: the security and research…

Cryptography and Security · Computer Science 2025-05-13 Brian Challita, Pierre Parrend

Warning: This paper contains content that may be inappropriate or offensive. AI agents have gained significant recent attention due to their autonomous tool usage capabilities and their integration in various real-world applications. This…

Artificial Intelligence · Computer Science 2025-06-24 Ninareh Mehrabi, Tharindu Kumarage, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

LLM-based agent systems increasingly rely on agent skills sourced from open registries to extend their capabilities, yet the openness of such ecosystems makes skills difficult to thoroughly vet. Existing attacks rely on injecting malicious…

Cryptography and Security · Computer Science 2026-04-08 Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang, Jingcheng Deng, Zihao Wei, Shicheng Xu, Yuyao Ge, Xueqi Cheng

The rapid advancement of Vision-Language Models (VLMs) has brought their safety vulnerabilities into sharp focus. However, existing red teaming methods are fundamentally constrained by an inherent linear exploration paradigm, confining them…

Machine Learning · Computer Science 2026-03-25 Chunxiao Li, Lijun Li, Jing Shao

Recent advancements in automatic code generation using large language models (LLMs) have brought us closer to fully automated secure software development. However, existing approaches often rely on a single agent for code generation, which…

Software Engineering · Computer Science 2024-11-06 Ana Nunez, Nafis Tanveer Islam, Sumit Kumar Jha, Peyman Najafirad