Related papers: Language Models are Better Bug Detector Through Co…

Empirical Evaluation of Large Language Models in Automated Program Repair

The increasing prevalence of software bugs has made automated program repair (APR) a key research focus. Large language models (LLMs) offer new opportunities for APR, but existing studies mostly rely on smaller, earlier-generation models…

Software Engineering · Computer Science 2025-06-17 Jiajun Sun , Fengjie Li , Xinzhu Qi , Hongyu Zhang , Jiajun Jiang

Can LLMs Find Bugs in Code? An Evaluation from Beginner Errors to Security Vulnerabilities in Python and C++

Large Language Models (LLMs) such as ChatGPT-4, Claude 3, and LLaMA 4 are increasingly embedded in software/application development, supporting tasks from code generation to debugging. Yet, their real-world effectiveness in detecting…

Software Engineering · Computer Science 2026-04-28 Akshay Mhatre , Noujoud Nader , Patrick Diehl , Deepti Gupta

Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models

Identifying and resolving logic errors can be one of the most frustrating challenges for novices programmers. Unlike syntax errors, for which a compiler or interpreter can issue a message, logic errors can be subtle. In certain conditions,…

Human-Computer Interaction · Computer Science 2023-11-28 Stephen MacNeil , Paul Denny , Andrew Tran , Juho Leinonen , Seth Bernstein , Arto Hellas , Sami Sarsa , Joanne Kim

A Comparative Study on Large Language Models for Log Parsing

Background: Log messages provide valuable information about the status of software systems. This information is provided in an unstructured fashion and automated approaches are applied to extract relevant parameters. To ease this process,…

Software Engineering · Computer Science 2024-09-05 Merve Astekin , Max Hort , Leon Moonen

Fine-Tuning Code Language Models to Detect Cross-Language Bugs

Multilingual programming, which involves using multiple programming languages (PLs) in a single project, is increasingly common due to its benefits. However, it introduces cross-language bugs (CLBs), which arise from interactions between…

Software Engineering · Computer Science 2026-04-22 Zengyang Li , Yimeng Li , Binbin Huang , Peng Liang , Ran Mo , Hui Liu , Yutao Ma

Large Language Model for Vulnerability Detection: Emerging Results and Future Directions

Previous learning-based vulnerability detection methods relied on either medium-sized pre-trained models or smaller neural networks from scratch. Recent advancements in Large Pre-Trained Language Models (LLMs) have showcased remarkable…

Software Engineering · Computer Science 2024-01-30 Xin Zhou , Ting Zhang , David Lo

Debugging with Open-Source Large Language Models: An Evaluation

Large language models have shown good potential in supporting software development tasks. This is why more and more developers turn to LLMs (e.g. ChatGPT) to support them in fixing their buggy code. While this can save time and effort, many…

Software Engineering · Computer Science 2024-09-06 Yacine Majdoub , Eya Ben Charrada

Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models

The growing trend of vulnerability issues in software development as a result of a large dependence on open-source projects has received considerable attention recently. This paper investigates the effectiveness of Large Language Models…

Software Engineering · Computer Science 2024-09-17 Shaznin Sultana , Sadia Afreen , Nasir U. Eisty

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code

Large language models (LLMs) have demonstrated strong performance on a wide range of software engineering tasks, including code generation and analysis. However, most prior work relies on cloud-based models or specialized hardware, limiting…

Software Engineering · Computer Science 2026-04-28 Jelena Ilić Vulićević

Assessing the Code Clone Detection Capability of Large Language Models

This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. The evaluation involves testing the models on a variety of code pairs of different clone types…

Software Engineering · Computer Science 2024-07-03 Zixian Zhang , Takfarinas Saber

Leveraging Print Debugging to Improve Code Generation in Large Language Models

Large language models (LLMs) have made significant progress in code generation tasks, but their performance in tackling programming problems with complex data structures and algorithms remains suboptimal. To address this issue, we propose…

Computation and Language · Computer Science 2024-01-11 Xueyu Hu , Kun Kuang , Jiankai Sun , Hongxia Yang , Fei Wu

LLMs are Bug Replicators: An Empirical Study on LLMs' Capability in Completing Bug-prone Code

Large Language Models (LLMs) have demonstrated remarkable performance in code completion. However, the training data used to develop these models often contain a significant amount of buggy code. Yet, it remains unclear to what extent these…

Software Engineering · Computer Science 2025-03-17 Liwei Guo , Sixiang Ye , Zeyu Sun , Xiang Chen , Yuxia Zhang , Bo Wang , Jie M. Zhang , Zheng Li , Yong Liu

Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code

Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code…

Software Engineering · Computer Science 2024-01-17 Rangeet Pan , Ali Reza Ibrahimzada , Rahul Krishna , Divya Sankar , Lambert Pouguem Wassi , Michele Merler , Boris Sobolev , Raju Pavuluri , Saurabh Sinha , Reyhaneh Jabbarvand

DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models

The automated program repair field has attracted substantial interest over the years, but despite significant research efforts, creating a system that works well for complex semantic bugs such as security vulnerabilities has proven…

Cryptography and Security · Computer Science 2024-02-26 Berkay Berabi , Alexey Gronskiy , Veselin Raychev , Gishor Sivanrupan , Victor Chibotaru , Martin Vechev

Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models

The accurate classification of student help requests with respect to the type of help being sought can enable the tailoring of effective responses. Automatically classifying such requests is non-trivial, but large language models (LLMs)…

Computers and Society · Computer Science 2023-11-01 Jaromir Savelka , Paul Denny , Mark Liffiton , Brad Sheese

DebugBench: Evaluating Debugging Capability of Large Language Models

Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs'…

Software Engineering · Computer Science 2024-06-07 Runchu Tian , Yining Ye , Yujia Qin , Xin Cong , Yankai Lin , Yinxu Pan , Yesai Wu , Haotian Hui , Weichuan Liu , Zhiyuan Liu , Maosong Sun

Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey

Code cloning, the duplication of code fragments, is common in software development. While some reuse aids productivity, excessive cloning hurts maintainability and introduces bugs. Hence, automatic code clone detection is vital. Meanwhile,…

Software Engineering · Computer Science 2023-08-08 Shihan Dou , Junjie Shan , Haoxiang Jia , Wenhao Deng , Zhiheng Xi , Wei He , Yueming Wu , Tao Gui , Yang Liu , Xuanjing Huang

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

While automated vulnerability detection techniques have made promising progress in detecting security vulnerabilities, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such…

Cryptography and Security · Computer Science 2024-10-24 Avishree Khare , Saikat Dutta , Ziyang Li , Alaia Solko-Breslin , Rajeev Alur , Mayur Naik

Test Case Generation from Bug Reports via Large Language Models: A Cognitive Layered Evaluation Framework

Large Language Models (LLMs) are increasingly applied to automated software testing, yet their ability to generalize beyond memorized patterns and reason about natural language bug reports remains unclear. We present a systematic evaluation…

Software Engineering · Computer Science 2025-10-08 Irtaza Sajid Qureshi , Zhen Ming , Jiang

Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding

Qualitative analysis of textual contents unpacks rich and valuable information by assigning labels to the data. However, this process is often labor-intensive, particularly when working with large datasets. While recent AI-based tools…

Computation and Language · Computer Science 2023-04-24 Ziang Xiao , Xingdi Yuan , Q. Vera Liao , Rania Abdelghani , Pierre-Yves Oudeyer