Related papers: Evaluating Source Code Quality with Large Language…

200 papers

This study presents a quantitative evaluation of the code quality and security of five prominent Large Language Models (LLMs): Claude Sonnet 4, Claude 3.7 Sonnet, GPT-4o, Llama 3.2 90B, and OpenCoder 8B. While prior research has assessed…

Software Engineering · Computer Science 2025-08-21 Abbas Sabra, Olivier Schmitt, Joseph Tyler

As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality…

Background: The rise of Large Language Models (LLMs) in software development has opened new possibilities for code generation. Despite the widespread use of this technology, it remains unclear how well LLMs generate code solutions in terms…

Software Engineering · Computer Science 2025-08-04 Alfred Santa Molison, Marcia Moraes, Glaucia Melo, Fabio Santos, Wesley K. G. Assuncao

Large Language Models (LLMs) promise to streamline software code reviews, but their ability to produce consistent assessments remains an open question. In this study, we tested four leading LLMs -- GPT-4o mini, GPT-4o, Claude 3.5 Sonnet,…

Software Engineering · Computer Science 2025-03-03 Eugene Klishevich, Yegor Denisov-Blanch, Simon Obstbaum, Igor Ciobanu, Michal Kosinski

Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…

Software Engineering · Computer Science 2026-05-22 Wei Ma, Zhihao Lin, Shangqing Liu, Qiang Hu, Ye Liu, Wenhan Wang, Cen Zhang, Liming Nie, Li Li, Yang Liu, Lingxiao Jiang

Code readability is one of the main aspects of code quality, influenced by various properties like identifier names, comments, code structure, and adherence to standards. However, measuring this attribute poses challenges in both industry…

Software Engineering · Computer Science 2025-10-21 Igor Regis da Silva Simoes, Elaine Venson

Large language models (LLMs) have demonstrated impressive capabilities in code generation, achieving high scores on benchmarks such as HumanEval and MBPP. However, these benchmarks primarily assess functional correctness and neglect broader…

Software Engineering · Computer Science 2025-08-21 Scott Blyth, Sherlock A. Licorish, Christoph Treude, Markus Wagner

Large language models (LLMs) are increasingly used for automated code refactoring tasks. Although these models can quickly refactor code, the quality may exhibit inconsistencies and unpredictable behavior. In this article, we systematically…

Software Engineering · Computer Science 2026-02-26 Norman Peitek, Julia Hess, Sven Apel

Modern software relies on a multitude of automated testing and quality assurance tools to prevent errors, bugs and potential vulnerabilities. This study sets out to provide a head-to-head, quantitative and qualitative evaluation of six…

Software Engineering · Computer Science 2025-08-07 Damian Gnieciak, Tomasz Szandala

As software projects become increasingly complex, the volume and variety of issues in code files have grown substantially. Addressing this challenge requires efficient issue detection, resolution, and evaluation tools. This paper presents…

Software Engineering · Computer Science 2025-09-15 Seyed Moein Abtahi, Akramul Azim

Large Language Models (LLMs) are widely used in software engineering to generate, complete, translate, and fix code, improving developer productivity. While most research focuses on the energy consumption and carbon emissions of model…

Software Engineering · Computer Science 2026-04-15 Sabiya Banu Masthan Ali, Oussema Kirmani, Aroosa Hameed, Syed Muhammad Danish, Gautam Srivastava

This study examined code issue detection and revision automation by integrating Large Language Models (LLMs) such as OpenAI's GPT-3.5 Turbo and GPT-4o into software development workflows. A static code analysis framework detects issues such…

Software Engineering · Computer Science 2025-06-13 Seyed Moein Abtahi, Akramul Azim

Despite various approaches being employed to detect vulnerabilities, the number of reported vulnerabilities shows an upward trend over the years. This suggests the problems are not caught before the code is released, which could be caused…

Cryptography and Security · Computer Science 2025-02-14 Karl Tamberg, Hayretdin Bahsi

Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current…

Cryptography and Security · Computer Science 2025-11-14 Shih-Chieh Dai, Jun Xu, Guanhong Tao

Software code quality is a construct with three dimensions: maintainability, reliability, and functionality. Although many firms have incorporated code quality metrics in their operations, evaluating these metrics still lacks consistent…

Software Engineering · Computer Science 2024-01-17 Siyuan Jin, Mianmian Zhang, Yekai Guo, Yuejiang He, Ziyuan Li, Bichao Chen, Bing Zhu, Yong Xia

In software practice, static analysis tools remain an integral part of detecting defects in software and there have been various tools designed to run the analysis in different programming languages like Java, C++, and Python. This paper…

Software Engineering · Computer Science 2024-05-22 Jones Yeboah, Saheed Popoola

Large Language Models (LLMs) are one of the most promising developments in the field of artificial intelligence, and the software engineering community has readily noticed their potential role in the software development life-cycle.…

Software Engineering · Computer Science 2026-03-16 Greta Dolcetti, Vincenzo Arceri, Eleonora Iotti, Sergio Maffeis, Agostino Cortesi, Enea Zaffanella

Code readability is crucial for software comprehension and maintenance, yet difficult to assess at scale. Traditional static metrics often fail to capture the subjective, context-sensitive nature of human judgments. Large Language Models…

Code smells are symptoms of potential code quality problems that may affect software maintainability, thus increasing development costs and impacting software reliability. Large language models (LLMs) have shown remarkable capabilities for…

Software Engineering · Computer Science 2026-01-16 Saymon Souza, Amanda Santana, Eduardo Figueiredo, Igor Muzetti, João Eduardo Montandon, Lionel Briand

Using large language models (LLMs) for source code has recently gained attention. LLMs, such as Transformer-based models like Codex and ChatGPT, have been shown to be highly capable of solving a wide range of programming problems. However,…

Computation and Language · Computer Science 2023-06-27 Atsushi Shirafuji, Yutaka Watanobe, Takumi Ito, Makoto Morishita, Yuki Nakamura, Yusuke Oda, Jun Suzuki