Related papers: UniDebugger: Hierarchical Multi-Agent Framework fo…

Where LLM Agents Fail and How They can Learn From Failures

Large Language Model (LLM) agents, which integrate planning, memory, reflection, and tool-use modules, have shown promise in solving complex, multi-step tasks. Yet their sophisticated architectures amplify vulnerability to cascading…

Artificial Intelligence · Computer Science 2025-10-01 Kunlun Zhu, Zijia Liu, Bingxuan Li, Muxin Tian, Yingxuan Yang, Jiaxun Zhang, Pengrui Han, Qipeng Xie, Fuyang Cui, Weijia Zhang, Xiaoteng Ma, Xiaodong Yu, Gowtham Ramesh, Jialian Wu, Zicheng Liu, Pan Lu, James Zou, Jiaxuan You

Debug2Fix: Can Interactive Debugging Help Coding Agents Fix More Bugs?

While significant progress has been made in automating various aspects of software development through coding agents, there is still significant room for improvement in their bug fixing capabilities. Debugging and investigation of runtime…

Software Engineering · Computer Science 2026-04-22 Spandan Garg, Yufan Huang

RepairAgent: An Autonomous, LLM-Based Agent for Program Repair

Automated program repair has emerged as a powerful technique to mitigate the impact of software bugs on system reliability and user experience. This paper introduces RepairAgent, the first work to address the program repair challenge…

Software Engineering · Computer Science 2024-10-29 Islem Bouzenia, Premkumar Devanbu, Michael Pradel

An Empirical Study on LLM-based Agents for Automated Bug Fixing

Large language models (LLMs) and LLM-based Agents have been applied to fix bugs automatically, demonstrating the capability in addressing software defects by engaging in development environment interaction, iterative validation and code…

Software Engineering · Computer Science 2025-10-21 Xiangxin Meng, Zexiong Ma, Pengfei Gao, Chao Peng

DebugHarness: Emulating Human Dynamic Debugging for Autonomous Program Repair

Patching severe security flaws in complex software remains a major challenge. While automated tools like fuzzers efficiently discover bugs, fixing deep-rooted low-level faults (e.g., use-after-free and memory corruption) still requires…

Software Engineering · Computer Science 2026-04-07 Maolin Sun, Yibiao Yang, Xuanlin Liu, Yuming Zhou, Baowen Xu

SelfHeal: Empirical Fix Pattern Analysis and Bug Repair in LLM Agents

Large Language Models (LLMs) have transformed software development and AI applications. While LLMs are designed for text processing, LLM agents extend this capability by enabling autonomous actions, tool use, and multi-step task completion.…

Software Engineering · Computer Science 2026-04-21 Niful Islam, Muhammad Anas Raza, Mohammad Wardat

AgentFL: Scaling LLM-based Fault Localization to Project-Level Context

Fault Localization (FL) is an essential step during the debugging process. With the strong capabilities of code comprehension, the recent Large Language Models (LLMs) have demonstrated promising performance in diagnosing bugs in the code.…

Software Engineering · Computer Science 2025-02-25 Yihao Qin, Shangwen Wang, Yiling Lou, Jinhao Dong, Kaixin Wang, Xiaoling Li, Xiaoguang Mao

StaAgent: An Agentic Framework for Testing Static Analyzers

Static analyzers play a critical role in identifying bugs early in the software development lifecycle, but their rule implementations are often under-tested and prone to inconsistencies. To address this, we propose StaAgent, an agentic…

Software Engineering · Computer Science 2025-07-23 Elijah Nnorom, Md Basim Uddin Ahmed, Jiho Shin, Hung Viet Pham, Song Wang

When Agents Fail: A Comprehensive Study of Bugs in LLM Agents with Automated Labeling

Large Language Models (LLMs) have revolutionized intelligent application development. While standalone LLMs cannot perform any actions, LLM agents address the limitation by integrating tools. However, debugging LLM agents is difficult and…

Software Engineering · Computer Science 2026-04-28 Niful Islam, Ragib Shahriar Ayon, Deepak George Thomas, Shibbir Ahmed, Mohammad Wardat

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration

While fine-tuned large language models (LLMs) excel in generating grammatically valid SQL in Text-to-SQL parsing, they often struggle to ensure semantic accuracy in queries, leading to user confusion and diminished system usability. To…

Computation and Language · Computer Science 2025-05-20 Jipeng Cen, Jiaxin Liu, Zhixu Li, Jingjing Wang

AgentStepper: Interactive Debugging of Software Development Agents

Software development agents powered by large language models (LLMs) have shown great promise in automating tasks like environment setup, issue solving, and program repair. Unfortunately, understanding and debugging such agents remain…

Software Engineering · Computer Science 2026-02-09 Robert Hutter, Michael Pradel

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Large Language Models (LLMs) have made significant strides in code generation and problem solving. Current approaches employ external tool-based iterative debuggers that use compiler or other tool-based runtime feedback to refine coarse…

Computation and Language · Computer Science 2026-04-28 Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez

FGDM: Reasoning Aware Multi-Agentic Framework for Software Bug Detection using Chain of Thought and Tree of Thought Prompting

Deep Learning methods are becoming prominent in automated software bug detection; however, they lack the global understanding of the given code. Consequently, their performance tends to degrade, especially when they are applied to large…

Software Engineering · Computer Science 2026-04-29 Srita Padmanabhuni, Bhargavi Karuturi, Jerusha Karen Indupalli, Santhan Reddy Chilla, Vivek Yelleti

Automatic Debugging Support for UML Designs

Design of large software systems requires rigorous application of software engineering methods covering all phases of the software process. Debugging during the early design phases is extremely important, because late bug-fixes are…

Software Engineering · Computer Science 2007-05-23 Johann Schumann

COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis

Code debugging is a vital stage of software development, essential for ensuring the reliability and performance of Large Language Models (LLMs) in the code generation task. Human debugging typically follows a multi-stage process, which…

Software Engineering · Computer Science 2025-02-13 Weiqing Yang, Hanbin Wang, Zhenghao Liu, Xinze Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Zhiyuan Liu, Ge Yu

DebugBench: Evaluating Debugging Capability of Large Language Models

Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs'…

Software Engineering · Computer Science 2024-06-07 Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun

VerifiAgent: a Unified Verification Agent in Language Model Reasoning

Large language models demonstrate remarkable reasoning capabilities but often produce unreliable or incorrect responses. Existing verification methods are typically model-specific or domain-restricted, requiring significant computational…

Computation and Language · Computer Science 2025-08-22 Jiuzhou Han, Wray Buntine, Ehsan Shareghi

Agent That Debugs: Dynamic State-Guided Vulnerability Repair

In recent years, more vulnerabilities have been discovered every day, while manual vulnerability repair requires specialized knowledge and is time-consuming. As a result, many detected or even published vulnerabilities remain unpatched,…

Software Engineering · Computer Science 2025-04-11 Zhengyao Liu, Yunlong Ma, Jingxuan Xu, Junchen Ai, Xiang Gao, Hailong Sun, Abhik Roychoudhury

SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair

Large Language Models (LLMs) have enabled intelligent agents that autonomously interact with environments and invoke external tools. Recently, agent-based software repair has drawn wide attention, as repair agents can localize bugs,…

Software Engineering · Computer Science 2026-05-26 Quanjun Zhang, Chengyu Gao, Yu Han, Ye Shang, Chunrong Fang, Zhenyu Chen, Liang Xiao

EngiAgent: Fully Connected Coordination of LLM Agents for Solving Open-ended Engineering Problems with Feasible Solutions

Engineering problem solving is central to real-world decision-making, requiring mathematical formulations that not only represent complex problems but also produce feasible solutions under data and physical constraints. Unlike mathematical…

Artificial Intelligence · Computer Science 2026-05-05 Xiyuan Zhou, Ruixi Zou, Xinlei Wang, Yuheng Cheng, Yan Xu, Junhua Zhao, Jinjin Gu