Related papers: debug-gym: A Text-Based Environment for Interactiv…

200 papers

Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains a persistent challenge due to their…

Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team…

Machine Learning · Computer Science · 2026-03-11 · Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve

Large Language Models (LLMs) now excel at generative skills and can create content at impeccable speeds. However, they are imperfect and still make various mistakes. In a Computer Science education context, as these models are widely…

Human-Computer Interaction · Computer Science · 2024-10-11 · Qianou Ma, Hua Shen, Kenneth Koedinger, Tongshuang Wu

Large language models (LLMs) have made significant progress in code generation tasks, but their performance in tackling programming problems with complex data structures and algorithms remains suboptimal. To address this issue, we propose…

Computation and Language · Computer Science · 2024-01-11 · Xueyu Hu, Kun Kuang, Jiankai Sun, Hongxia Yang, Fei Wu

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science · 2024-06-12 · Li Zhong, Zilong Wang, Jingbo Shang

The widespread availability of Large Language Models (LLMs) within Integrated Development Environments (IDEs) has led to their speedy adoption. Conversational interactions with LLMs enable programmers to obtain natural language explanations…

Human-Computer Interaction · Computer Science · 2024-02-12 · Bhavya Chopra, Yasharth Bajpai, Param Biyani, Gustavo Soares, Arjun Radhakrishna, Chris Parnin, Sumit Gulwani

In programming education, the Debugging and Teaching (DT) task is a common scenario where students receive assistance in correcting their erroneous code. The task involves multiple inputs, including erroneous code, error messages, reference…

Software Engineering · Computer Science · 2025-10-14 · Lingyue Fu, Haowei Yuan, Datong Chen, Xinyi Dai, Qingyao Li, Weinan Zhang, Weiwen Liu, Yong Yu

Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior. In this work, we argue that the capacity to learn new…

Artificial Intelligence · Computer Science · 2024-08-09 · Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Debugging is a critical aspect of LLMs' coding ability. Early debugging efforts primarily focused on code-level analysis, which often falls short when addressing complex programming errors that require a deeper understanding of algorithmic…

Computation and Language · Computer Science · 2025-10-30 · Weiming Zhang, Qingyao Li, Xinyi Dai, Jizheng Chen, Kounianhua Du, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

Game environments provide rich, controllable settings that simulate many aspects of real-world complexity. As such, game agents offer a valuable testbed for exploring capabilities relevant to Artificial General Intelligence. Recently, the…

Artificial Intelligence · Computer Science · 2025-11-05 · Sihao Hu, Tiansheng Huang, Gaowen Liu, Ramana Rao Kompella, Fatih Ilhan, Selim Furkan Tekin, Yichang Xu, Zachary Yahn, Ling Liu

Recent works have successfully leveraged the ability of Large Language Models (LLMs) to capture abstract knowledge about the world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong…

Machine Learning · Computer Science · 2026-02-02 · Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer

A wide range of real-world applications is characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning. This paper investigates the potential application of Large Language Models (LLMs) as symbolic…

Computation and Language · Computer Science · 2024-01-18 · Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang

Large Language Models (LLMs) are transforming programming practices, offering significant capabilities for code generation activities. While researchers have explored the potential of LLMs in various domains, this paper focuses on their use…

Software Engineering · Computer Science · 2026-05-04 · Deborah Etsenake, Meiyappan Nagappan

Advancements in large language models (LLMs) are revolutionizing interactive game design, enabling dynamic plotlines and interactions between players and non-player characters (NPCs). However, LLMs may exhibit flaws such as hallucinations,…

Computation and Language · Computer Science · 2025-02-25 · Claire Jin, Sudha Rao, Xiangyu Peng, Portia Botchway, Jessica Quaye, Chris Brockett, Bill Dolan

The use of large language models (LLMs) for automated code generation has emerged as a significant focus within AI research. As these pretrained models continue to evolve, their ability to understand and generate complex code structures has…

Software Engineering · Computer Science · 2025-05-06 · Nazmus Ashrafi, Salah Bouktif, Mohammed Mediani

Debugging is a critical but challenging task for programmers. This paper proposes ChatDBG, an AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of…

Software Engineering · Computer Science · 2025-06-23 · Kyla H. Levin, Nicolas van Kempen, Emery D. Berger, Stephen N. Freund

Recent studies have uncovered the potential of Large Language Models (LLMs) in addressing complex sequential decision-making tasks through the provision of high-level instructions. However, LLM-based agents lack specialization in tackling…

Artificial Intelligence · Computer Science · 2024-05-28 · Zihao Zhou, Bin Hu, Chenyang Zhao, Pu Zhang, Bin Liu

Reinforcement learning (RL) has recently shown impressive performance in complex game AI and robotics tasks. To a large extent, this is thanks to the availability of simulated environments such as OpenAI Gym, Atari Learning Environment, or…

Computation and Language · Computer Science · 2020-11-18 · Rajkumar Ramamurthy, Rafet Sifa, Christian Bauckhage

Significant advancements have occurred in the application of Large Language Models (LLMs) for social simulations. Despite this, their abilities to perform teaming in task-oriented social events are underexplored. Such capabilities are…

Artificial Intelligence · Computer Science · 2025-08-18 · Yuan Li, Lichao Sun, Yixuan Zhang

It has been established in recent work that Large Language Models (LLMs) can be prompted to "self-play" conversational games that probe certain capabilities (general instruction following, strategic goal orientation, language understanding…

Computation and Language · Computer Science · 2024-06-03 · Anne Beyer, Kranti Chalamalasetti, Sherzod Hakimov, Brielen Madureira, Philipp Sadler, David Schlangen