Related papers: Validating LLM-Generated Programs with Metamorphic…

Understanding Defects in Generated Codes by Language Models

This study investigates the reliability of code generation by Large Language Models (LLMs), focusing on identifying and analyzing defects in the generated code. Despite the advanced capabilities of LLMs in automating code generation,…

Software Engineering · Computer Science 2024-08-27 Ali Mohammadi Esfahani, Nafiseh Kahani, Samuel A. Ajila

Testing LLMs on Code Generation with Varying Levels of Prompt Specificity

Large language models (LLMs) have demonstrated unparalleled prowess in mimicking human-like text generation and processing. Among the myriad of applications that benefit from LLMs, automated code generation is increasingly promising. The…

Software Engineering · Computer Science 2023-11-15 Lincoln Murr, Morgan Grainger, David Gao

The Impact of Prompt Programming on Function-Level Code Generation

Large Language Models (LLMs) are increasingly used by software engineers for code generation. However, limitations of LLMs such as irrelevant or incorrect code have highlighted the need for prompt programming (or prompt engineering) where…

Software Engineering · Computer Science 2025-07-09 Ranim Khojah, Francisco Gomes de Oliveira Neto, Mazen Mohamad, Philipp Leitner

LLM4VV: Evaluating Cutting-Edge LLMs for Generation and Evaluation of Directive-Based Parallel Programming Model Compiler Tests

The usage of Large Language Models (LLMs) for software and test development has continued to increase since LLMs were first introduced, but only recently have the expectations of LLMs become more realistic. Verifying the correctness of code…

Software Engineering · Computer Science 2025-08-20 Zachariah Sollenberger, Rahul Patel, Saieda Ali Zada, Sunita Chandrasekaran

Generative transformations and patterns in LLM-native approaches for software verification and falsification

The emergence of prompting as the dominant paradigm for leveraging Large Language Models (LLMs) has led to a proliferation of LLM-native software, where application behavior arises from complex, stochastic data transformations. However, the…

Software Engineering · Computer Science 2025-10-08 Víctor A. Braberman, Flavia Bonomo-Braberman, Yiannis Charalambous, Juan G. Colonna, Lucas C. Cordeiro, Rosiane de Freitas

LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems

Although Large Language Models (LLMs) have established pre-dominance in automated code generation, they are not devoid of shortcomings. The pertinent issues primarily relate to the absence of execution guarantees for generated code, a lack…

Software Engineering · Computer Science 2024-01-12 Mohamad Fakih, Rahul Dharmaji, Yasamin Moghaddas, Gustavo Quiros Araya, Oluwatosin Ogundare, Mohammad Abdullah Al Faruque

Exploring and Lifting the Robustness of LLM-powered Automated Program Repair with Metamorphic Testing

In recent years, Large language model-powered Automated Program Repair (LAPR) techniques have achieved state-of-the-art bug-fixing performance and have been pervasively applied and studied in both industry and academia. Nonetheless, LLMs…

Software Engineering · Computer Science 2025-03-11 Pengyu Xue, Linhao Wu, Zhen Yang, Zhongxing Yu, Zhi Jin, Ge Li, Yan Xiao, Shuo Liu, Xinyi Li, Hongyi Lin, Jingwen Wu

Metamorphic Malware Evolution: The Potential and Peril of Large Language Models

Code metamorphism refers to a computer programming exercise wherein the program modifies its own code (partial or entire) consistently and automatically while retaining its core functionality. This technique is often used for online…

Cryptography and Security · Computer Science 2024-11-05 Pooria Madani

Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization

Large Language Models (LLMs) are nowadays extensively used for various types of software engineering tasks, primarily code generation. Previous research has shown how suitable prompt engineering could help developers in improving their code…

Software Engineering · Computer Science 2026-01-21 Alessandro Midolo, Alessandro Giagnorio, Fiorella Zampetti, Rosalia Tufano, Gabriele Bavota, Massimiliano Di Penta

Bidirectional Empowerment of Metamorphic Testing and Large Language Models: A Systematic Survey

Large language models (LLMs) have introduced substantial challenges to software quality assurance due to their generative, probabilistic, and open-ended nature, which intensifies the oracle problem and limits the applicability of…

Software Engineering · Computer Science 2026-05-15 Zheng Zheng, Zenghui Zhou, Yinwang Xu, Daixu Ren, Tsong Yueh Chen

Interactions with Prompt Problems: A New Way to Teach Programming with Large Language Models

Large Language Models (LLMs) have upended decades of pedagogy in computing education. Students previously learned to code through writing many small problems with less emphasis on code reading and comprehension. Recent research has…

Human-Computer Interaction · Computer Science 2024-01-22 James Prather, Paul Denny, Juho Leinonen, David H. Smith, Brent N. Reeves, Stephen MacNeil, Brett A. Becker, Andrew Luxton-Reilly, Thezyrie Amarouche, Bailey Kimmel

Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis

In the past few years, Large Language Models (LLMs) have exploded in usefulness and popularity for code generation tasks. However, LLMs still struggle with accuracy and are unsuitable for high-risk applications without additional oversight…

Software Engineering · Computer Science 2024-10-29 William Murphy, Nikolaus Holzer, Feitong Qiao, Leyi Cui, Raven Rothkopf, Nathan Koenig, Mark Santolucito

LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation

Large language models (LLMs) are a new and powerful tool for a wide span of applications involving natural language and demonstrate impressive code generation abilities. The goal of this work is to automatically generate tests and use these…

Artificial Intelligence · Computer Science 2024-03-12 Christian Munley, Aaron Jarmusch, Sunita Chandrasekaran

Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

Program synthesis has been long studied with recent approaches focused on directly using the power of Large Language Models (LLMs) to generate code. Programming benchmarks, with curated synthesis problems and test-cases, are used to measure…

Software Engineering · Computer Science 2023-11-01 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, Lingming Zhang

Test-Driven Development for Code Generation

Recent Large Language Models (LLMs) have demonstrated significant capabilities in generating code snippets directly from problem statements. This increasingly automated process mirrors traditional human-led software development, where code…

Software Engineering · Computer Science 2024-10-23 Noble Saji Mathews, Meiyappan Nagappan

Prompting Techniques for Secure Code Generation: A Systematic Investigation

Large Language Models (LLMs) are gaining momentum in software development with prompt-driven programming enabling developers to create code from natural language (NL) instructions. However, studies have questioned their ability to produce…

Software Engineering · Computer Science 2025-02-27 Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, Riccardo Scandariato

Examination of Code generated by Large Language Models

Large language models (LLMs), such as ChatGPT and Copilot, are transforming software development by automating code generation and, arguably, enable rapid prototyping, support education, and boost productivity. Therefore, correctness and…

Software Engineering · Computer Science 2024-08-30 Robin Beer, Alexander Feix, Tim Guttzeit, Tamara Muras, Vincent Müller, Maurice Rauscher, Florian Schäffler, Welf Löwe

Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models

Prompt engineering reduces reasoning mistakes in Large Language Models (LLMs). However, its effectiveness in mitigating vulnerabilities in LLM-generated code remains underexplored. To address this gap, we implemented a benchmark to…

Software Engineering · Computer Science 2025-02-11 Marc Bruni, Fabio Gabrielli, Mohammad Ghafari, Martin Kropp

Metamorphic Testing of Large Language Models for Natural Language Processing

Using large language models (LLMs) to perform natural language processing (NLP) tasks has become increasingly pervasive in recent times. The versatile nature of LLMs makes them applicable to a wide range of such tasks. While the performance…

Software Engineering · Computer Science 2026-01-12 Steven Cho, Stefano Ruberto, Valerio Terragni

Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation

Large language models (LLMs) and prompt engineering hold significant potential for advancing computer programming education through personalized instruction. This paper explores this potential by investigating three critical research…

Artificial Intelligence · Computer Science 2024-07-09 Tianyu Wang, Nianjun Zhou, Zhixiong Chen