Related papers: Enhancing Differential Testing With LLMs For Testi…

Understanding Bugs in Multi-Language Deep Learning Frameworks

Deep learning frameworks (DLFs) have been playing an increasingly important role in this intelligence age since they act as a basic infrastructure for an increasingly wide range of AI-based applications. Meanwhile, as…

Software Engineering · Computer Science 2023-03-07 Zengyang Li, Sicheng Wang, Wenshuo Wang, Peng Liang, Ran Mo, Bing Li

Deep Learning Library Testing: Definition, Methods and Challenges

In recent years, software systems powered by deep learning (DL) techniques have significantly facilitated people's lives in many aspects. As the backbone of these DL systems, various DL libraries undertake the underlying optimization and…

Software Engineering · Computer Science 2025-02-06 Xiaoyu Zhang, Weipeng Jiang, Chao Shen, Qi Li, Qian Wang, Chenhao Lin, Xiaohong Guan

DiffSpec: Differential Testing with LLMs using Natural Language Specifications and Code Artifacts

Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, or language runtimes. Specifications for such…

Software Engineering · Computer Science 2025-05-07 Nikitha Rao, Elizabeth Gilbert, Harrison Green, Tahina Ramananandro, Nikhil Swamy, Claire Le Goues, Sarah Fakhoury

Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models

Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like buffer overflows and use-after-free errors. Traditional fuzzing struggles with the complexity and API diversity of DL libraries such as…

Software Engineering · Computer Science 2025-01-09 Kunpeng Zhang, Shuai Wang, Jitao Han, Xiaogang Zhu, Xian Li, Shaohua Wang, Sheng Wen

Checker Bug Detection and Repair in Deep Learning Libraries

Checker bugs in Deep Learning (DL) libraries are critical yet not well-explored. These bugs are often concealed in the input validation and error-checking code of DL libraries and can lead to silent failures, incorrect results, or…

Software Engineering · Computer Science 2024-10-10 Nima Shiri Harzevili, Mohammad Mahdi Mohajer, Jiho Shin, Moshi Wei, Gias Uddin, Jinqiu Yang, Junjie Wang, Song Wang, Zhen Ming Jiang, Nachiappan Nagappan

Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors

LLMs are transforming software development, yet current code generation and code repair benchmarks mainly assess syntactic and functional correctness in simple, single-error cases. LLMs' capabilities to autonomously find and fix runtime…

Computation and Language · Computer Science 2025-09-17 Zhiyu Yang, Shuo Wang, Yukun Yan, Yang Deng

Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing

Deep learning (DL) techniques have proven effective in many challenging tasks and have become widely adopted in practice. However, previous work has shown that DL libraries, the basis of building and executing DL models, contain bugs and can…

Software Engineering · Computer Science 2022-05-10 Jiazhen Gu, Xuchuan Luo, Yangfan Zhou, Xin Wang

LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer

Deep learning (DL) libraries are widely used in critical applications, where even subtle silent bugs can lead to serious consequences. While existing DL fuzzing techniques have made progress in detecting crashes, they inherently struggle to…

Software Engineering · Computer Science 2026-03-02 Kunpeng Zhang, Dongwei Xiao, Daoyuan Wu, Shuai Wang, Jiali Zhao, Yuanyi Lin, Tongtong Xu, Shaohua Wang

A Tale of Two DL Cities: When Library Tests Meet Compiler

Deep Learning (DL) compilers typically load a DL model and optimize it with intermediate representation. Existing DL compiler testing techniques mainly focus on model optimization stages, but rarely explore bug detection at the model loading…

Software Engineering · Computer Science 2024-08-15 Qingchao Shen, Yongqiang Tian, Haoyang Ma, Junjie Chen, Lili Huang, Ruifeng Fu, Shing-Chi Cheung, Zan Wang

CITADEL: Context Similarity Based Deep Learning Framework Bug Finding

With the application of deep learning technology, tools of DL framework testing are in high demand. Existing DL framework testing tools have limited coverage of bug types. For example, they lack the capability of effectively finding…

Software Engineering · Computer Science 2025-10-29 Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Shiwei Wang, Chao Shen

MirrorFuzz: Leveraging LLM and Shared Bugs for Deep Learning Framework APIs Fuzzing

Deep learning (DL) frameworks serve as the backbone for a wide range of artificial intelligence applications. However, bugs within DL frameworks can cascade into critical issues in higher-level applications, jeopardizing reliability and…

Software Engineering · Computer Science 2025-10-20 Shiwen Ou, Yuwei Li, Lu Yu, Chengkun Wei, Tingke Wen, Qiangpu Chen, Yu Chen, Haizhi Tang, Zulie Pan

Understanding Performance Problems in Deep Learning Systems

Deep learning (DL) has been widely applied to many domains. Unique challenges in engineering DL systems are posed by the programming paradigm shift from traditional systems to DL systems, and performance is one of the challenges.…

Software Engineering · Computer Science 2022-11-01 Junming Cao, Bihuan Chen, Chao Sun, Longjie Hu, Shuaihong Wu, Xin Peng

Improving Deep Learning Library Testing with Machine Learning

Deep Learning (DL) libraries like TensorFlow and PyTorch simplify machine learning (ML) model development but are prone to bugs due to their complex design. Bug-finding techniques exist, but without precise API specifications, they produce…

Software Engineering · Computer Science 2026-02-04 Facundo Molina, M M Abid Naziri, Feiran Qin, Alessandra Gorla, Marcelo d'Amorim

Toward Understanding Deep Learning Framework Bugs

DL frameworks are the basis of constructing all DL programs and models, and thus their bugs could lead to the unexpected behaviors of any DL program or model relying on them. Such a wide effect demonstrates the necessity and importance of…

Software Engineering · Computer Science 2024-08-22 Junjie Chen, Yihua Liang, Qingchao Shen, Jiajun Jiang, Shuochuan Li

Finding Missed Code Size Optimizations in Compilers using LLMs

Compilers are complex, and significant effort has been expended on testing them. Techniques such as random program generation and differential testing have proved highly effective and have uncovered thousands of bugs in production…

Software Engineering · Computer Science 2025-01-03 Davide Italiano, Chris Cummins

Understanding LLM-Centric Challenges for Deep Learning Frameworks: An Empirical Analysis

Large language models (LLMs) have driven significant progress across a wide range of real-world applications. Realizing such models requires substantial system-level support. Deep learning (DL) frameworks provide this foundation by enabling…

Software Engineering · Computer Science 2025-08-19 Yanzhou Mu, Rong Wang, Juan Zhai, Chunrong Fang, Xiang Chen, Jiacong Wu, An Guo, Jiawei Shen, Bingzhuo Li, Zhenyu Chen

Testing Deep Learning Libraries via Neurosymbolic Constraint Learning

Deep Learning (DL) libraries (e.g., PyTorch) are popular in AI development. These libraries are complex and contain bugs. Researchers have proposed various bug-finding techniques for such libraries. Yet, there is much room for improvement.…

Software Engineering · Computer Science 2026-01-23 M M Abid Naziri, Shinhae Kim, Feiran Qin, Marcelo d'Amorim, Saikat Dutta

NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers

Deep-learning (DL) compilers such as TVM and TensorRT are increasingly being used to optimize deep neural network (DNN) models to meet performance, resource utilization and other requirements. Bugs in these compilers can result in models…

Machine Learning · Computer Science 2023-01-02 Jiawei Liu, Jinkun Lin, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, Lingming Zhang

Fuzzing Deep-Learning Libraries via Automated Relational API Inference

A growing body of research has been dedicated to DL model testing. However, there is still limited work on testing DL libraries, which serve as the foundations for building, training, and running DL models. Prior work on fuzzing DL…

Software Engineering · Computer Science 2022-07-13 Yinlin Deng, Chenyuan Yang, Anjiang Wei, Lingming Zhang

A Comprehensive Study of Bugs in Modern Distributed Deep Learning Systems

In today's data-driven era, deep learning is vital for processing massive datasets, yet single-device training is constrained by computational and memory limits. Distributed deep learning overcomes these challenges by leveraging multiple…

Software Engineering · Computer Science 2025-12-24 Xiaoxue Ma, Wanwei Zhan, Jiale Chen, Yishu Li, Jacky Keung, Federica Sarro