Related papers: ChatDBG: Augmenting Debugging with Large Language …

Debugging with Open-Source Large Language Models: An Evaluation

Large language models have shown good potential in supporting software development tasks. This is why more and more developers turn to LLMs (e.g. ChatGPT) to support them in fixing their buggy code. While this can save time and effort, many…

Software Engineering · Computer Science 2024-09-06 Yacine Majdoub, Eya Ben Charrada

Extending the Frontier of ChatGPT: Code Generation and Debugging

Large-scale language models (LLMs) have emerged as a groundbreaking innovation in the realm of question-answering and conversational agents. These models, leveraging different deep learning architectures such as Transformers, are trained on…

Software Engineering · Computer Science 2023-07-18 Fardin Ahsan Sakib, Saadat Hasan Khan, A. H. M. Rezaul Karim

DebugBench: Evaluating Debugging Capability of Large Language Models

Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs'…

Software Engineering · Computer Science 2024-06-07 Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun

An Empirical Study on the Capability of LLMs in Decomposing Bug Reports

Background: Bug reports are essential to the software development life cycle. They help developers track and resolve issues, but are often difficult to process due to their complexity, which can delay resolution and affect software quality.…

Software Engineering · Computer Science 2025-04-30 Zhiyuan Chen, Vanessa Nava-Camal, Ahmad Suleiman, Yiming Tang, Daqing Hou, Weiyi Shang

ChatLogic: Integrating Logic Programming with Large Language Models for Multi-Step Reasoning

Large language models (LLMs) such as ChatGPT and GPT-4 have demonstrated impressive capabilities in various generative tasks. However, their performance is often hampered by limitations in accessing and leveraging long-term memory, leading…

Artificial Intelligence · Computer Science 2024-07-16 Zhongsheng Wang, Jiamou Liu, Qiming Bao, Hongfei Rong, Jingfeng Zhang

Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science 2024-06-12 Li Zhong, Zilong Wang, Jingbo Shang

A Systematic Approach for Large Language Models Debugging

Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains a persistent challenge due to their…

Artificial Intelligence · Computer Science 2026-04-28 Basel Shbita, Anna Lisa Gentile, Bing Zhang, Sungeun An, Shailja Thakur, Shubhi Asthana, Yi Zhou, Saptha Surendran, Farhan Ahmed, Rohan Kulkarni, Yuya Jeremy Ong, Chad DeLuca, Hima Patel

ChatDev: Communicative Agents for Software Development

Software development is a complex task that necessitates cooperation among multiple members with diverse skills. Numerous studies used deep learning to improve specific phases in a waterfall model, such as design, coding, and testing.…

Software Engineering · Computer Science 2024-06-06 Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun

Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code

Large language models (LLMs) like ChatGPT have shown the potential to assist developers with coding and debugging tasks. However, their role in collaborative issue resolution is underexplored. In this study, we analyzed 1,152…

Software Engineering · Computer Science 2024-12-12 Joy Krishan Das, Saikat Mondal, Chanchal K. Roy

DevGPT: Studying Developer-ChatGPT Conversations

This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT, a prominent large language model (LLM). The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code…

Software Engineering · Computer Science 2024-02-15 Tao Xiao, Christoph Treude, Hideaki Hata, Kenichi Matsumoto

Exploring Interaction Patterns for Debugging: Enhancing Conversational Capabilities of AI-assistants

The widespread availability of Large Language Models (LLMs) within Integrated Development Environments (IDEs) has led to their speedy adoption. Conversational interactions with LLMs enable programmers to obtain natural language explanations…

Human-Computer Interaction · Computer Science 2024-02-12 Bhavya Chopra, Yasharth Bajpai, Param Biyani, Gustavo Soares, Arjun Radhakrishna, Chris Parnin, Sumit Gulwani

A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair

Large Language Models (LLMs) have been gaining increasing attention and demonstrated promising performance across a variety of Software Engineering (SE) tasks, such as Automated Program Repair (APR), code summarization, and code completion.…

Software Engineering · Computer Science 2024-04-18 Quanjun Zhang, Tongke Zhang, Juan Zhai, Chunrong Fang, Bowen Yu, Weisong Sun, Zhenyu Chen

ProDebug: An Automated Debugging System for Prolog

Prolog is a well-known declarative programming language commonly used in introductory courses on logic and reasoning. However, many students find Prolog challenging because it lacks the familiar debugging mechanisms found in imperative…

Programming Languages · Computer Science 2026-05-27 Ricardo Brancas, Vasco Manquinho, Ruben Martins

Can ChatGPT Support Developers? An Empirical Evaluation of Large Language Models for Code Generation

Large language models (LLMs) have demonstrated notable proficiency in code generation, with numerous prior studies showing their promising capabilities in various development scenarios. However, these studies mainly provide evaluations in…

Software Engineering · Computer Science 2024-03-19 Kailun Jin, Chung-Yu Wang, Hung Viet Pham, Hadi Hemmati

DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models

The automated program repair field has attracted substantial interest over the years, but despite significant research efforts, creating a system that works well for complex semantic bugs such as security vulnerabilities has proven…

Cryptography and Security · Computer Science 2024-02-26 Berkay Berabi, Alexey Gronskiy, Veselin Raychev, Gishor Sivanrupan, Victor Chibotaru, Martin Vechev

Enhancing Debugging Skills with AI-Powered Assistance: A Real-Time Tool for Debugging Support

Debugging is a crucial skill in programming education and software development, yet it is often overlooked in CS curricula. To address this, we introduce an AI-powered debugging assistant integrated into an IDE. It offers real-time support…

Software Engineering · Computer Science 2026-01-07 Elizaveta Artser, Daniil Karol, Anna Potriasaeva, Aleksei Rostovskii, Katsiaryna Dzialets, Ekaterina Koshchenko, Xiaotian Su, April Yi Wang, Anastasiia Birillo

Can LLMs Demystify Bug Reports?

Bugs are notoriously challenging: they slow down software users and result in time-consuming investigations for developers. These challenges are exacerbated when bugs must be reported in natural language by users. Indeed, we lack reliable…

Software Engineering · Computer Science 2023-10-11 Laura Plein, Tegawendé F. Bissyandé

Is ChatGPT the Ultimate Programming Assistant -- How far is it?

Recently, the ChatGPT LLM has received great attention: it can be used as a bot for discussing source code, prompting it to suggest changes, provide descriptions or even generate code. Typical demonstrations generally focus on existing…

Software Engineering · Computer Science 2023-09-01 Haoye Tian, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, Tegawendé F. Bissyandé

Teaching Large Language Models to Self-Debug

Large language models (LLMs) have achieved impressive performance on code generation. However, for complex programming tasks, generating the correct solution in one go becomes challenging, thus some prior works have designed program repair…

Computation and Language · Computer Science 2023-10-06 Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou

Can LLMs Find Bugs in Code? An Evaluation from Beginner Errors to Security Vulnerabilities in Python and C++

Large Language Models (LLMs) such as ChatGPT-4, Claude 3, and LLaMA 4 are increasingly embedded in software/application development, supporting tasks from code generation to debugging. Yet, their real-world effectiveness in detecting…

Software Engineering · Computer Science 2026-04-28 Akshay Mhatre, Noujoud Nader, Patrick Diehl, Deepti Gupta