Related papers: debug-gym: A Text-Based Environment for Interactiv…

A Systematic Approach for Large Language Models Debugging

Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains a persistent challenge due to their…

Artificial Intelligence · Computer Science 2026-04-28 Basel Shbita, Anna Lisa Gentile, Bing Zhang, Sungeun An, Shailja Thakur, Shubhi Asthana, Yi Zhou, Saptha Surendran, Farhan Ahmed, Rohan Kulkarni, Yuya Jeremy Ong, Chad DeLuca, Hima Patel

Towards a Neural Debugger for Python

Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team…

Machine Learning · Computer Science 2026-03-11 Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve

How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging

Large Language Models (LLMs) now excel at generative skills and can create content at impeccable speeds. However, they are imperfect and still make various mistakes. In a Computer Science education context, as these models are widely…

Human-Computer Interaction · Computer Science 2024-10-11 Qianou Ma, Hua Shen, Kenneth Koedinger, Tongshuang Wu

Leveraging Print Debugging to Improve Code Generation in Large Language Models

Large language models (LLMs) have made significant progress in code generation tasks, but their performance in tackling programming problems with complex data structures and algorithms remains suboptimal. To address this issue, we propose…

Computation and Language · Computer Science 2024-01-11 Xueyu Hu, Kun Kuang, Jiankai Sun, Hongxia Yang, Fei Wu

Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science 2024-06-12 Li Zhong, Zilong Wang, Jingbo Shang

Exploring Interaction Patterns for Debugging: Enhancing Conversational Capabilities of AI-assistants

The widespread availability of Large Language Models (LLMs) within Integrated Development Environments (IDEs) has led to their speedy adoption. Conversational interactions with LLMs enable programmers to obtain natural language explanations…

Human-Computer Interaction · Computer Science 2024-02-12 Bhavya Chopra, Yasharth Bajpai, Param Biyani, Gustavo Soares, Arjun Radhakrishna, Chris Parnin, Sumit Gulwani

DebugTA: An LLM-Based Agent for Simplifying Debugging and Teaching in Programming Education

In programming education, Debugging and Teaching (DT) task is a common scenario where students receive assistance in correcting their erroneous code. The task involves multiple inputs, including erroneous code, error messages, reference…

Software Engineering · Computer Science 2025-10-14 Lingyue Fu, Haowei Yuan, Datong Chen, Xinyi Dai, Qingyao Li, Weinan Zhang, Weiwen Liu, Yong Yu

Empowering Large Language Model Agents through Action Learning

Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior. In this work, we argue that the capacity to learn new…

Artificial Intelligence · Computer Science 2024-08-09 Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging

Debugging is a critical aspect of LLM's coding ability. Early debugging efforts primarily focused on code-level analysis, which often falls short when addressing complex programming errors that require a deeper understanding of algorithmic…

Computation and Language · Computer Science 2025-10-30 Weiming Zhang, Qingyao Li, Xinyi Dai, Jizheng Chen, Kounianhua Du, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

A Survey on Large Language Model-Based Game Agents

Game environments provide rich, controllable settings that stimulate many aspects of real-world complexity. As such, game agents offer a valuable testbed for exploring capabilities relevant to Artificial General Intelligence. Recently, the…

Artificial Intelligence · Computer Science 2025-11-05 Sihao Hu, Tiansheng Huang, Gaowen Liu, Ramana Rao Kompella, Fatih Ilhan, Selim Furkan Tekin, Yichang Xu, Zachary Yahn, Ling Liu

Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

Recent works successfully leveraged Large Language Models' (LLM) abilities to capture abstract knowledge about world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong…

Machine Learning · Computer Science 2026-02-02 Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer

Large Language Models Are Neurosymbolic Reasoners

A wide range of real-world applications is characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning. This paper investigates the potential application of Large Language Models (LLMs) as symbolic…

Computation and Language · Computer Science 2024-01-18 Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang

Understanding the Human-LLM Dynamic: A Literature Survey of LLM Use in Programming Tasks

Large Language Models (LLMs) are transforming programming practices, offering significant capabilities for code generation activities. While researchers have explored the potential of LLMs in various domains, this paper focuses on their use…

Software Engineering · Computer Science 2026-05-04 Deborah Etsenake, Meiyappan Nagappan

Automatic Bug Detection in LLM-Powered Text-Based Games Using LLMs

Advancements in large language models (LLMs) are revolutionizing interactive game design, enabling dynamic plotlines and interactions between players and non-player characters (NPCs). However, LLMs may exhibit flaws such as hallucinations,…

Computation and Language · Computer Science 2025-02-25 Claire Jin, Sudha Rao, Xiangyu Peng, Portia Botchway, Jessica Quaye, Chris Brockett, Bill Dolan

Enhancing LLM Code Generation: A Systematic Evaluation of Multi-Agent Collaboration and Runtime Debugging for Improved Accuracy, Reliability, and Latency

The use of large language models (LLMs) for automated code generation has emerged as a significant focus within AI research. As these pretrained models continue to evolve, their ability to understand and generate complex code structures has…

Software Engineering · Computer Science 2025-05-06 Nazmus Ashrafi, Salah Bouktif, Mohammed Mediani

ChatDBG: Augmenting Debugging with Large Language Models

Debugging is a critical but challenging task for programmers. This paper proposes ChatDBG, an AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of…

Software Engineering · Computer Science 2025-06-23 Kyla H. Levin, Nicolas van Kempen, Emery D. Berger, Stephen N. Freund

Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents

Recent studies have uncovered the potential of Large Language Models (LLMs) in addressing complex sequential decision-making tasks through the provision of high-level instructions. However, LLM-based agents lack specialization in tackling…

Artificial Intelligence · Computer Science 2024-05-28 Zihao Zhou, Bin Hu, Chenyang Zhao, Pu Zhang, Bin Liu

NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks

Reinforcement learning (RL) has recently shown impressive performance in complex game AI and robotics tasks. To a large extent, this is thanks to the availability of simulated environments such as OpenAI Gym, Atari Learning Environment, or…

Computation and Language · Computer Science 2020-11-18 Rajkumar Ramamurthy, Rafet Sifa, Christian Bauckhage

MetaAgents: Large Language Model Based Agents for Decision-Making on Teaming

Significant advancements have occurred in the application of Large Language Models (LLMs) for social simulations. Despite this, their abilities to perform teaming in task-oriented social events are underexplored. Such capabilities are…

Artificial Intelligence · Computer Science 2025-08-18 Yuan Li, Lichao Sun, Yixuan Zhang

clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents

It has been established in recent work that Large Language Models (LLMs) can be prompted to "self-play" conversational games that probe certain capabilities (general instruction following, strategic goal orientation, language understanding…

Computation and Language · Computer Science 2024-06-03 Anne Beyer, Kranti Chalamalasetti, Sherzod Hakimov, Brielen Madureira, Philipp Sadler, David Schlangen