Related papers: AgentStepper: Interactive Debugging of Software De…

Interactive Debugging and Steering of Multi-Agent AI Systems

Fully autonomous teams of LLM-powered AI agents are emerging that collaborate to perform complex tasks for users. What challenges do developers face when trying to build and debug these AI agent teams? In formative interviews with five AI…

Multiagent Systems · Computer Science 2025-03-06 Will Epperson, Gagan Bansal, Victor Dibia, Adam Fourney, Jack Gerrits, Erkang Zhu, Saleema Amershi

Defining and Detecting the Defects of the Large Language Model-based Autonomous Agents

AI agents are systems capable of perceiving their environment and autonomously planning and executing tasks. Recent advancements in LLMs have introduced a transformative paradigm for AI agents, enabling them to interact with external resources…

Software Engineering · Computer Science 2024-12-30 Kaiwen Ning, Jiachi Chen, Jingwen Zhang, Wei Li, Zexu Wang, Yuming Feng, Weizhe Zhang, Zibin Zheng

Where LLM Agents Fail and How They can Learn From Failures

Large Language Model (LLM) agents, which integrate planning, memory, reflection, and tool-use modules, have shown promise in solving complex, multi-step tasks. Yet their sophisticated architectures amplify vulnerability to cascading…

Artificial Intelligence · Computer Science 2025-10-01 Kunlun Zhu, Zijia Liu, Bingxuan Li, Muxin Tian, Yingxuan Yang, Jiaxun Zhang, Pengrui Han, Qipeng Xie, Fuyang Cui, Weijia Zhang, Xiaoteng Ma, Xiaodong Yu, Gowtham Ramesh, Jialian Wu, Zicheng Liu, Pan Lu, James Zou, Jiaxuan You

Agentless: Demystifying LLM-based Software Engineering Agents

Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry…

Software Engineering · Computer Science 2024-10-30 Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang

InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration

Large Language Models (LLMs) frequently generate buggy code with complex logic errors that are challenging to diagnose. While existing LLM-based self-repair approaches conduct intensive static semantic analysis or rely on superficial…

Software Engineering · Computer Science 2025-10-22 Yunkun Wang, Yue Zhang, Guochang Li, Chen Zhi, Binhua Li, Fei Huang, Yongbin Li, Shuiguang Deng

LLM Agents Making Agent Tools

Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks by dynamically utilising external software components. However, these tools must be implemented in advance by human developers,…

Computation and Language · Computer Science 2025-06-02 Georg Wölflein, Dyke Ferber, Daniel Truhn, Ognjen Arandjelović, Jakob Nikolas Kather

Towards Adaptive Software Agents for Debugging

Using multiple agents has been found to improve the debugging capabilities of Large Language Models. However, increasing the number of LLM agents has several drawbacks, such as increasing the running costs and raising the risk for the agents to…

Software Engineering · Computer Science 2025-04-28 Yacine Majdoub, Eya Ben Charrada, Haifa Touati

An Empirical Study of Agent Developer Practices in AI Agent Frameworks

The rise of large language models (LLMs) has sparked a surge of interest in agents, leading to the rapid growth of agent frameworks. Agent frameworks are software toolkits and libraries that provide standardized components, abstractions,…

Software Engineering · Computer Science 2025-12-02 Yanlin Wang, Xinyi Xu, Jiachi Chen, Tingting Bi, Wenchao Gu, Zibin Zheng

AgentMesh: A Cooperative Multi-Agent Generative AI Framework for Software Development Automation

Software development is a complex, multi-phase process traditionally requiring collaboration among individuals with diverse expertise. We propose AgentMesh, a Python-based framework that uses multiple cooperating LLM-powered agents to…

Software Engineering · Computer Science 2025-07-29 Sourena Khanzadeh

AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs). These models have revolutionized NLP tasks, particularly in code generation, aiding…

Computation and Language · Computer Science 2024-05-27 Dong Huang, Jie M. Zhang, Michael Luck, Qingwen Bu, Yuhao Qing, Heming Cui

Agentic Software Issue Resolution with Large Language Models: A Survey

Software issue resolution aims to address real-world issues in software repositories (e.g., bug fixing and efficiency optimization) based on natural language descriptions provided by users, representing a key aspect of software maintenance.…

Software Engineering · Computer Science 2025-12-30 Zhonghao Jiang, David Lo, Zhongxin Liu

AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Driven by rapid advancements of Large Language Models (LLMs), agents are empowered to combine intrinsic knowledge with dynamic tool use, greatly enhancing their capacity to address real-world tasks. In line with such an evolution,…

Artificial Intelligence · Computer Science 2025-08-25 Dawei Gao, Zitao Li, Yuexiang Xie, Weirui Kuang, Liuyi Yao, Bingchen Qian, Zhijian Ma, Yue Cui, Haohao Luo, Shen Li, Lu Yi, Yi Yu, Shiqi He, Zhiling Luo, Wenmeng Zhou, Zhicheng Zhang, Xuguang He, Ziqian Chen, Weikai Liao, Farruh Isakulovich Kushnazarov, Yaliang Li, Bolin Ding, Jingren Zhou

Agent-Driven Automatic Software Improvement

With software maintenance accounting for 50% of the cost of developing software, enhancing code quality and reliability has become more critical than ever. In response to this challenge, this doctoral research proposal aims to explore…

Software Engineering · Computer Science 2024-06-25 Fernando Vallecillos Ruiz

AgentScope: A Flexible yet Robust Multi-Agent Platform

With the rapid advancement of Large Language Models (LLMs), significant progress has been made in multi-agent applications. However, the complexities in coordinating agents' cooperation and LLMs' erratic performance pose notable challenges…

Multiagent Systems · Computer Science 2024-05-21 Dawei Gao, Zitao Li, Xuchen Pan, Weirui Kuang, Zhijian Ma, Bingchen Qian, Fei Wei, Wenhao Zhang, Yuexiang Xie, Daoyuan Chen, Liuyi Yao, Hongyi Peng, Zeyu Zhang, Lin Zhu, Chen Cheng, Hongzhu Shi, Yaliang Li, Bolin Ding, Jingren Zhou

AgentFL: Scaling LLM-based Fault Localization to Project-Level Context

Fault Localization (FL) is an essential step during the debugging process. With their strong code comprehension capabilities, recent Large Language Models (LLMs) have demonstrated promising performance in diagnosing bugs in the code.…

Software Engineering · Computer Science 2025-02-25 Yihao Qin, Shangwen Wang, Yiling Lou, Jinhao Dong, Kaixin Wang, Xiaoling Li, Xiaoguang Mao

Agentic AI Process Observability: Discovering Behavioral Variability

AI agents that leverage Large Language Models (LLMs) are increasingly becoming core building blocks of modern software systems. A wide range of frameworks is now available to support the specification of such applications. These frameworks…

Artificial Intelligence · Computer Science 2025-11-04 Fabiana Fournier, Lior Limonad, Yuval David

Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories

Large Language Model (LLM)-based agents are increasingly employed to automate complex software engineering tasks, such as program repair and issue resolution. These agents operate by autonomously generating natural language thoughts,…

Software Engineering · Computer Science 2025-10-09 Islem Bouzenia, Michael Pradel

AgentEvolver: Towards Efficient Self-Evolving Agent System

Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments. However, current approaches to…

Machine Learning · Computer Science 2025-11-14 Yunpeng Zhai, Shuchang Tao, Cheng Chen, Anni Zou, Ziqian Chen, Qingxu Fu, Shinji Mai, Li Yu, Jiaji Deng, Zouying Cao, Zhaoyang Liu, Bolin Ding, Jingren Zhou

UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging

Software debugging is a time-consuming endeavor involving a series of steps, such as fault localization and patch generation, each requiring thorough analysis and a deep understanding of the underlying logic. While large language models…

Software Engineering · Computer Science 2025-11-19 Cheryl Lee, Chunqiu Steven Xia, Longji Yang, Jen-tse Huang, Zhouruixin Zhu, Lingming Zhang, Michael R. Lyu

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress

Despite rapid development, large language models (LLMs) still encounter challenges in multi-turn decision-making tasks (i.e., agent tasks) like web shopping and browser navigation, which require making a sequence of intelligent decisions…

Computation and Language · Computer Science 2025-11-12 Zhiheng Xi, Chenyang Liao, Guanyu Li, Yajie Yang, Wenxiang Chen, Zhihao Zhang, Binghai Wang, Senjie Jin, Yuhao Zhou, Jian Guan, Wei Wu, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang