Related papers: DocTer: Documentation Guided Fuzzing for Testing D…
Modern AI- and Data-intensive software systems rely heavily on data science and machine learning libraries that provide essential algorithmic implementations and computational frameworks. These libraries expose complex APIs whose correct…
Deep learning (DL) applications are prevalent nowadays as they can help with multiple tasks. DL libraries are essential for building DL applications. Furthermore, DL operators are the important building blocks of the DL libraries, that…
Deep Learning (DL) libraries like TensorFlow and Pytorch simplify machine learning (ML) model development but are prone to bugs due to their complex design. Bug-finding techniques exist, but without precise API specifications, they produce…
Containerization allows developers to define the execution environment in which their software needs to be installed. Docker is the leading platform in this field, and developers that use it are required to write a Dockerfile for their…
Comments within source code are essential for developers to comprehend the code's purpose and ensure its correct usage. However, as codebases evolve, maintaining an accurate alignment between the comments and the code becomes increasingly…
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like buffer overflows and use-after-free errors. Traditional fuzzing struggles with the complexity and API diversity of DL libraries such as…
Many automatic unit test generation tools that can generate unit test cases with high coverage over a program have been proposed. However, most of these tools are ineffective on deep learning (DL) frameworks due to the fact that many of…
A growing body of research has been dedicated to DL model testing. However, there is still limited work on testing DL libraries, which serve as the foundations for building, training, and running DL models. Prior work on fuzzing DL…
Differential testing offers a promising strategy to alleviate the test oracle problem by comparing the test results between alternative implementations. However, existing differential testing techniques for deep learning (DL) libraries are…
Recently, many Deep Learning fuzzers have been proposed for testing of DL libraries. However, they either perform unguided input generation (e.g., not considering the relationship between API arguments when generating inputs) or only…
Checker bugs in Deep Learning (DL) libraries are critical yet not well-explored. These bugs are often concealed in the input validation and error-checking code of DL libraries and can lead to silent failures, incorrect results, or…
Recent developments show that Large Language Models (LLMs) produce state-of-the-art performance on natural language (NL) to code generation for resource-rich general-purpose languages like C++, Java, and Python. However, their practical…
Deep learning powers critical applications such as autonomous driving, healthcare, and finance, where the correctness of underlying libraries is essential. Bugs in widely used deep learning APIs can propagate to downstream systems, causing…
Recent deep learning (DL) applications are mostly built on top of DL libraries. The quality assurance of these libraries is critical to the dependable deployment of DL applications. Techniques have been proposed to generate various DL…
Web APIs may have constraints on parameters, such that not all parameters are either always required or always optional. Moreover, the presence or value of one parameter could cause another parameter to be required, or parameters could have…
Deep learning (DL) has been widely applied to many domains. Unique challenges in engineering DL systems are posed by the programming paradigm shift from traditional systems to DL systems, and performance is one of the challenges.…
Data science libraries, such as scikit-learn and pandas, specialize in processing and manipulating data. The data-centric nature of these libraries makes the detection of API misuse in them more challenging. This paper introduces DSCHECKER,…
In this work, we set out to conduct the first ground-truth empirical evaluation of state-of-the-art DL fuzzers. Specifically, we first manually created an extensive DL bug benchmark dataset, which includes 627 real-world DL bugs from…
As machine learning gains prominence in various sectors of society for automated decision-making, concerns have risen regarding potential vulnerabilities in machine learning (ML) frameworks. Nevertheless, testing these frameworks is a…
Deep Learning (DL) libraries such as PyTorch provide the core components to build major AI-enabled applications. Finding bugs in these libraries is important and challenging. Prior approaches have tackled this by performing either API-level…