Related papers: MuFF: Stable and Sensitive Post-training Mutation …

An Empirical Study of the Realism of Mutants in Deep Learning

Mutation analysis is a well-established technique for assessing test quality in the traditional software development paradigm by injecting artificial faults into programs. Its application to deep learning (DL) has expanded beyond classical…

Software Engineering · Computer Science 2025-12-19 Zaheed Ahmed, Philip Makedonski, Jens Grabowski

Quality-Driven Selective Mutation for Deep Learning

Mutants support testing and debugging in two roles: (i) as test goals and (ii) as substitutes for real faults. Hard-to-kill mutants provide better guidance for test improvement, while realism is essential when mutants are used to simulate…

Software Engineering · Computer Science 2026-04-27 Zaheed Ahmed, Emmanuel Charleson Dapaah, Philip Makedonski, Jens Grabowski

Mutation Testing of Deep Reinforcement Learning Based on Real Faults

Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing…

Machine Learning · Computer Science 2023-01-16 Florian Tambon, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Giuliano Antoniol

A Probabilistic Framework for Mutation Testing in Deep Neural Networks

Context: Mutation Testing (MT) is an important tool in traditional Software Engineering (SE) white-box testing. It aims to artificially inject faults in a system to evaluate a test suite's capability to detect them, assuming that the test…

Software Engineering · Computer Science 2023-01-16 Florian Tambon, Foutse Khomh, Giuliano Antoniol

muPRL: A Mutation Testing Pipeline for Deep Reinforcement Learning based on Real Faults

Reinforcement Learning (RL) is increasingly adopted to train agents that can deal with complex sequential tasks, such as driving an autonomous vehicle or controlling a humanoid robot. Correspondingly, novel approaches are needed to ensure…

Software Engineering · Computer Science 2024-08-28 Deepak-George Thomas, Matteo Biagiola, Nargiz Humbatova, Mohammad Wardat, Gunel Jahangirova, Hridesh Rajan, Paolo Tonella

DeepMutation: Mutation Testing of Deep Learning Systems

Deep learning (DL) defines a new data-driven programming paradigm where the internal system logic is largely shaped by the training data. The standard way of evaluating DL models is to examine their performance on a test dataset. The…

Software Engineering · Computer Science 2018-08-16 Lei Ma, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Felix Juefei-Xu, Chao Xie, Li Li, Yang Liu, Jianjun Zhao, Yadong Wang

Using Fourier Analysis and Mutant Clustering to Accelerate DNN Mutation Testing

Deep neural network (DNN) mutation analysis is a promising approach to evaluating test set adequacy. Due to the large number of generated mutants that must be tested on large datasets, mutation analysis is costly. In this paper, we present…

Software Engineering · Computer Science 2025-10-06 Ali Ghanbari, Sasan Tavakkol

Mutation-based Fault Localization of Deep Neural Networks

Deep neural networks (DNNs) are susceptible to bugs, just like other types of software systems. A significant uptick in using DNN, and its applications in wide-ranging areas, including safety-critical systems, warrant extensive research on…

Software Engineering · Computer Science 2023-09-12 Ali Ghanbari, Deepak-George Thomas, Muhammad Arbab Arshad, Hridesh Rajan

Deep Learning Framework Testing via Model Mutation: How Far Are We?

Deep Learning (DL) frameworks are a fundamental component of DL development. Therefore, the detection of DL framework defects is important and challenging. As one of the most widely adopted DL testing techniques, model mutation has recently…

Software Engineering · Computer Science 2025-07-08 Yanzhou Mu, Rong Wang, Juan Zhai, Chunrong Fang, Xiang Chen, Zhiyuan Peng, Peiran Yang, Ruixiang Qian, Shaoyu Yang, Zhenyu Chen

What Are We Really Testing in Mutation Testing for Machine Learning? A Critical Reflection

Mutation testing is a well-established technique for assessing a test suite's quality by injecting artificial faults into production code. In recent years, mutation testing has been extended to machine learning (ML) systems, and deep…

Software Engineering · Computer Science 2021-03-03 Annibale Panichella, Cynthia C. S. Liem

Beyond Force Metrics: Pre-Training MLFFs for Stable MD Simulations

Machine-learning force fields (MLFFs) have emerged as a promising solution for speeding up ab initio molecular dynamics (MD) simulations, where accurate force predictions are critical but often computationally expensive. In this work, we…

Chemical Physics · Physics 2025-12-22 Shagun Maheshwari, Zhengxian Tang, Janghoon Ock, Adeesh Kolluru, Amir Barati Farimani, John R. Kitchin

Exploring Robustness of Image Recognition Models on Hardware Accelerators

As the usage of Artificial Intelligence (AI) on resource-intensive and safety-critical tasks increases, a variety of Machine Learning (ML) compilers have been developed, enabling compatibility of Deep Neural Networks (DNNs) with a variety…

Machine Learning · Computer Science 2025-03-26 Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan

DevMuT: Testing Deep Learning Framework via Developer Expertise-Based Mutation

Deep learning (DL) frameworks are the fundamental infrastructure for various DL applications. Framework defects can profoundly cause disastrous accidents, thus requiring sufficient detection. In previous studies, researchers adopt DL models…

Software Engineering · Computer Science 2025-07-08 Yanzhou Mu, Juan Zhai, Chunrong Fang, Xiang Chen, Zhixiang Cao, Peiran Yang, Yinglong Zou, Tao Zheng, Zhenyu Chen

Stability-Aware Training of Machine Learning Force Fields with Differentiable Boltzmann Estimators

Machine learning force fields (MLFFs) are an attractive alternative to ab-initio methods for molecular dynamics (MD) simulations. However, they can produce unstable simulations, limiting their ability to model phenomena occurring over…

Machine Learning · Computer Science 2025-02-26 Sanjeev Raja, Ishan Amin, Fabian Pedregosa, Aditi S. Krishnapriyan

DeepMetis: Augmenting a Deep Learning Test Set to Increase its Mutation Score

Deep Learning (DL) components are routinely integrated into software systems that need to perform complex tasks such as image or natural language processing. The adequacy of the test data used to test such systems can be assessed by their…

Software Engineering · Computer Science 2021-09-17 Vincenzo Riccio, Nargiz Humbatova, Gunel Jahangirova, Paolo Tonella

On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering

Mutation analysis of deep neural networks (DNNs) is a promising method for effective evaluation of test data quality and model robustness, but it can be computationally expensive, especially for large models. To alleviate this, we present…

Software Engineering · Computer Science 2025-01-23 Lauren Lyons, Ali Ghanbari

Leveraging Propagated Infection to Crossfire Mutants

Mutation testing was proposed to identify weaknesses in test suites by repeatedly generating artificially faulty versions of the software (mutants) and determining if the test suite is sufficient to detect them (kill them). When the tests…

Software Engineering · Computer Science 2024-11-18 Hang Du, Vijay Krishna Palepu, James A. Jones

DeepMutation: A Neural Mutation Tool

Mutation testing can be used to assess the fault-detection capabilities of a given test suite. To this aim, two characteristics of mutation testing frameworks are of paramount importance: (i) they should generate mutants that are…

Software Engineering · Computer Science 2020-02-14 Michele Tufano, Jason Kimko, Shiya Wang, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Denys Poshyvanyk

Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing

Deep learning (DL) techniques are proven effective in many challenging tasks, and become widely-adopted in practice. However, previous work has shown that DL libraries, the basis of building and executing DL models, contain bugs and can…

Software Engineering · Computer Science 2022-05-10 Jiazhen Gu, Xuchuan Luo, Yangfan Zhou, Xin Wang

How Multi-Modal LLMs Reshape Visual Deep Learning Testing? A Comprehensive Study Through the Lens of Image Mutation

Visual deep learning (VDL) systems have shown significant success in real-world applications like image recognition, object detection, and autonomous driving. To evaluate the reliability of VDL, a mainstream approach is software testing,…

Software Engineering · Computer Science 2024-12-24 Liwen Wang, Yuanyuan Yuan, Ao Sun, Zongjie Li, Pingchuan Ma, Daoyuan Wu, Shuai Wang