Related papers: Mull it over: mutation testing based on LLVM

MILE: A Mutation Testing Framework of In-Context Learning Systems

In-context Learning (ICL) has achieved notable success in the applications of large language models (LLMs). By adding only a few input-output pairs that demonstrate a new task, the LLM can efficiently learn the task during inference without…

Software Engineering · Computer Science 2024-09-10 Zeming Wei, Yihao Zhang, Meng Sun

MMT: Mutation Testing of Java Bytecode with Model Transformation -- An Illustrative Demonstration

Mutation testing is an approach to check the robustness of test suites. The program code is slightly changed by mutations to inject errors. A test suite is robust enough if it finds such errors. Tools for mutation testing usually integrate…

Software Engineering · Computer Science 2024-04-23 Christoph Bockisch, Gabriele Taentzer, Daniel Neufeld

LLMorpheus: Mutation Testing using Large Language Models

In mutation testing, the quality of a test suite is evaluated by introducing faults into a program and determining whether the program's tests detect them. Most existing approaches for mutation testing involve the application of a fixed set…

Software Engineering · Computer Science 2025-03-10 Frank Tip, Jonathan Bell, Max Schaefer

Mutation-based Consistency Testing for Evaluating the Code Understanding Capability of LLMs

Large Language Models (LLMs) have shown remarkable capabilities in processing both natural and programming languages, which have enabled various applications in software engineering, such as requirement engineering, code generation, and…

Software Engineering · Computer Science 2024-01-12 Ziyu Li, Donghwan Shin

Mutation Testing via Iterative Large Language Model-Driven Scientific Debugging

Large Language Models (LLMs) can generate plausible test code. Intuitively they generate this by imitating tests seen in their training data, rather than reasoning about execution semantics. However, such reasoning is important when…

Software Engineering · Computer Science 2025-03-12 Philipp Straubinger, Marvin Kreis, Stephan Lukasczyk, Gordon Fraser

What Are We Really Testing in Mutation Testing for Machine Learning? A Critical Reflection

Mutation testing is a well-established technique for assessing a test suite's quality by injecting artificial faults into production code. In recent years, mutation testing has been extended to machine learning (ML) systems, and deep…

Software Engineering · Computer Science 2021-03-03 Annibale Panichella, Cynthia C. S. Liem

Towards Translating Real-World Code with LLMs: A Study of Translating to Rust

Large language models (LLMs) show promise in code translation - the task of translating code written in one programming language to another language - due to their ability to write code in most programming languages. However, LLM's…

Software Engineering · Computer Science 2025-04-18 Hasan Ferit Eniser, Hanliang Zhang, Cristina David, Meng Wang, Maria Christakis, Brandon Paulsen, Joey Dodds, Daniel Kroening

Mutation Testing of Deep Reinforcement Learning Based on Real Faults

Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing…

Machine Learning · Computer Science 2023-01-16 Florian Tambon, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Giuliano Antonio

DeepMutation: Mutation Testing of Deep Learning Systems

Deep learning (DL) defines a new data-driven programming paradigm where the internal system logic is largely shaped by the training data. The standard way of evaluating DL models is to examine their performance on a test dataset. The…

Software Engineering · Computer Science 2018-08-16 Lei Ma, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Felix Juefei-Xu, Chao Xie, Li Li, Yang Liu, Jianjun Zhao, Yadong Wang

Building Bridges: Julia as an MLIR Frontend

Driven by increasing compute requirements for deep learning models, compiler developers have been looking for ways to target specialised hardware and heterogeneous systems more efficiently. The MLIR project has the goal to offer…

Programming Languages · Computer Science 2025-03-10 Jules Merckx

How Multi-Modal LLMs Reshape Visual Deep Learning Testing? A Comprehensive Study Through the Lens of Image Mutation

Visual deep learning (VDL) systems have shown significant success in real-world applications like image recognition, object detection, and autonomous driving. To evaluate the reliability of VDL, a mainstream approach is software testing,…

Software Engineering · Computer Science 2024-12-24 Liwen Wang, Yuanyuan Yuan, Ao Sun, Zongjie Li, Pingchuan Ma, Daoyuan Wu, Shuai Wang

Mutation-Guided Unit Test Generation with a Large Language Model

Unit tests play a vital role in uncovering potential faults in software. While tools like EvoSuite focus on maximizing code coverage, recent advances in large language models (LLMs) have shifted attention toward LLM-based test generation.…

Software Engineering · Computer Science 2026-04-17 Guancheng Wang, Qinghua Xu, Lionel Briand, Kui Liu

OVAL: the CMS Testing Robot

Oval is a testing tool that helps developers detect unexpected changes in the behavior of their software. It is able to automatically compile some test programs, to prepare on the fly the needed configuration files, to run the tests…

Software Engineering · Computer Science 2007-05-23 D. Chamont, C. Charlot

MIST-RL: Mutation-based Incremental Suite Testing via Reinforcement Learning

Large Language Models (LLMs) often fail to generate correct code on the first attempt, which requires using generated unit tests as verifiers to validate the solutions. Despite the success of recent verification methods, they remain…

Artificial Intelligence · Computer Science 2026-03-03 Sicheng Zhu, Jiajun Wang, Jiawei Ai, Xin Li

MLIR-Smith: A Novel Random Program Generator for Evaluating Compiler Pipelines

Compilers are essential for the performance and correct execution of software and hold universal relevance across various scientific disciplines. Despite this, there is a notable lack of tools for testing and evaluating them, especially…

Programming Languages · Computer Science 2026-01-06 Berke Ates, Filip Dobrosavljević, Theodoros Theodoridis, Zhendong Su

Test It Before You Trust It: Applying Software Testing for Trustworthy In-context Learning

In-context learning (ICL) has emerged as a powerful capability of large language models (LLMs), enabling them to perform new tasks based on a few provided examples without explicit fine-tuning. Despite their impressive adaptability, these…

Software Engineering · Computer Science 2025-09-09 Teeradaj Racharak, Chaiyong Ragkhitwetsagul, Chommakorn Sontesadisai, Thanwadee Sunetnanta

LittleDarwin: a Feature-Rich and Extensible Mutation Testing Framework for Large and Complex Java Systems

Mutation testing is a well-studied method for increasing the quality of a test suite. We designed LittleDarwin as a mutation testing framework able to cope with large and complex Java software systems, while still being easily extensible…

Software Engineering · Computer Science 2017-07-06 Ali Parsai, Alessandro Murgia, Serge Demeyer

Reasoning About LLVM Code Using Codewalker

This paper reports on initial experiments using J Moore's Codewalker to reason about programs compiled to the Low-Level Virtual Machine (LLVM) intermediate form. Previously, we reported on a translator from LLVM to the applicative subset of…

Logic in Computer Science · Computer Science 2015-09-22 David S. Hardin

A Comprehensive Study on Large Language Models for Mutation Testing

Large Language Models (LLMs) have recently been used to generate mutants in both research work and in industrial practice. However, there has been no comprehensive empirical study of their performance for this increasingly important…

Software Engineering · Computer Science 2026-01-23 Bo Wang, Mingda Chen, Ming Deng, Youfang Lin, Mark Harman, Mike Papadakis, Jie M. Zhang

LLMalMorph: On The Feasibility of Generating Variant Malware using Large-Language-Models

Large Language Models (LLMs) have transformed software development and automated code generation. Motivated by these advancements, this paper explores the feasibility of LLMs in modifying malware source code to generate variants. We…

Cryptography and Security · Computer Science 2025-10-07 Md Ajwad Akil, Adrian Shuai Li, Imtiaz Karim, Arun Iyengar, Ashish Kundu, Vinny Parla, Elisa Bertino