Related papers: CodeImprove: Program Adaptation for Deep Code Mode…

On-the-fly Improving Performance of Deep Code Models via Input Denoising

Deep learning has been widely adopted to tackle various code-based tasks by building deep code models based on a large amount of code snippets. While these deep code models have achieved great success, even state-of-the-art models suffer…

Software Engineering · Computer Science 2023-08-22 Zhao Tian , Junjie Chen , Xiangyu Zhang

Test Input Validation for Vision-based DL Systems: An Active Learning Approach

Testing deep learning (DL) systems requires extensive and diverse, yet valid, test inputs. While synthetic test input generation methods, such as metamorphic testing, are widely used for DL testing, they risk introducing invalid inputs that…

Software Engineering · Computer Science 2025-01-06 Delaram Ghobari , Mohammad Hossein Amini , Dai Quoc Tran , Seunghee Park , Shiva Nejati , Mehrdad Sabetzadeh

Framework for On the Fly Input Refinement for Deep Learning Models

Advancements in deep learning have significantly improved model performance across tasks involving code, text, and image processing. However, these models still exhibit notable mispredictions in real-world applications, even when trained on…

Software Engineering · Computer Science 2025-06-25 Ravishka Rathnasuriya

On-the-Fly Input Adaptation for Reliable Code Intelligence

Code language models (CLMs) play a central role in software engineering across both generation and classification tasks. However, these models still exhibit notable mispredictions in real-world applications, even when trained on up-to-date…

Software Engineering · Computer Science 2026-05-20 Ravishka Rathnasuriya , Wei Yang

Learning Performance-Improving Code Edits

With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the…

Software Engineering · Computer Science 2024-04-29 Alexander Shypula , Aman Madaan , Yimeng Zeng , Uri Alon , Jacob Gardner , Milad Hashemi , Graham Neubig , Parthasarathy Ranganathan , Osbert Bastani , Amir Yazdanbakhsh

DeepCodeProbe: Towards Understanding What Models Trained on Code Learn

Machine learning models trained on code and related artifacts offer valuable support for software maintenance but suffer from interpretability issues due to their complex internal variables. These concerns are particularly significant in…

Software Engineering · Computer Science 2024-07-15 Vahid Majdinasab , Amin Nikanjam , Foutse Khomh

D-LiFT: Improving LLM-based Decompiler Backend via Code Quality-driven Fine-tuning

As one of the key tools in many security tasks, decompilers reconstruct human-readable source code from binaries. Yet, despite recent advances, their outputs often suffer from syntactic and semantic errors and remain difficult to read.…

Cryptography and Security · Computer Science 2025-08-19 Muqi Zou , Hongyu Cai , Hongwei Wu , Zion Leonahenahe Basque , Arslan Khan , Berkay Celik , Dave Tian , Antonio Bianchi , Ruoyu Wang , Dongyan Xu

CORE: Automating Review Recommendation for Code Changes

Code review is a common process that is used by developers, in which a reviewer provides useful comments or points out defects in the submitted source code changes via pull request. Code review has been widely used for both industry and…

Software Engineering · Computer Science 2019-12-23 JingKai Siow , Cuiyun Gao , Lingling Fan , Sen Chen , Yang Liu

An Effective Approach to Embedding Source Code by Combining Large Language and Sentence Embedding Models

The advent of large language models (LLMs) has significantly advanced artificial intelligence (AI) in software engineering (SE), with source code embeddings playing a crucial role in tasks such as source code clone detection and source code…

Software Engineering · Computer Science 2025-06-04 Zixiang Xian , Chenhui Cui , Rubing Huang , Chunrong Fang , Zhenyu Chen

The Code Barrier: What LLMs Actually Understand?

Understanding code represents a core ability needed for automating software development tasks. While foundation models like LLMs show impressive results across many software engineering challenges, the extent of their true semantic…

Software Engineering · Computer Science 2025-04-16 Serge Lionel Nikiema , Jordan Samhi , Abdoul Kader Kaboré , Jacques Klein , Tegawendé F. Bissyandé

Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders

When evaluating the performance of a pre-trained model transferred to a downstream task, it is imperative to assess not only the in-distribution (ID) accuracy of the downstream model but also its capacity to generalize and identify…

Machine Learning · Computer Science 2024-04-02 Andrew Geng , Pin-Yu Chen

Test Selection for Deep Learning Systems

Testing of deep learning models is challenging due to the excessive number and complexity of computations involved. As a result, test data selection is performed manually and in an ad hoc way. This raises the question of how we can…

Machine Learning · Computer Science 2019-05-01 Wei Ma , Mike Papadakis , Anestis Tsakmalis , Maxime Cordy , Yves Le Traon

Data-Driven AI Model Signal-Awareness Enhancement and Introspection

AI modeling for source code understanding tasks has been making significant progress, and is being adopted in production development pipelines. However, reliability concerns, especially whether the models are actually learning task-related…

Software Engineering · Computer Science 2022-01-11 Sahil Suneja , Yufan Zhuang , Yunhui Zheng , Jim Laredo , Alessandro Morari

On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code

The advent of instruction-tuned Large Language Models designed for coding tasks (Code LLMs) has transformed software engineering practices. However, their robustness against various input challenges remains a critical concern. This study…

Software Engineering · Computer Science 2024-12-02 Md Imran Hossen , Xiali Hei

AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

In software development, it is common for programmers to copy-paste or port code snippets and then adapt them to their use case. This scenario motivates the code adaptation task -- a variant of program repair which aims to adapt variable…

Software Engineering · Computer Science 2023-10-09 Xiaoyu Liu , Jinu Jang , Neel Sundaresan , Miltiadis Allamanis , Alexey Svyatkovskiy

CodeDSI: Differentiable Code Search

Reimplementing solutions to previously solved software engineering problems is not only inefficient but also introduces inadequate and error-prone code. Many existing methods achieve impressive performance on this issue by using…

Software Engineering · Computer Science 2022-10-04 Usama Nadeem , Noah Ziems , Shaoen Wu

Towards Automating Code Review Activities

Code reviews are popular in both industrial and open source projects. The benefits of code reviews are widely recognized and include better code quality and lower likelihood of introducing bugs. However, since code review is a manual…

Software Engineering · Computer Science 2021-05-20 Rosalia Tufano , Luca Pascarella , Michele Tufano , Denys Poshyvanyk , Gabriele Bavota

Code2Image: Intelligent Code Analysis by Computer Vision Techniques and Application to Vulnerability Prediction

Intelligent code analysis has received increasing attention in parallel with the remarkable advances in the field of machine learning (ML) in recent years. A major challenge in leveraging ML for this purpose is to represent source code in a…

Software Engineering · Computer Science 2021-05-10 Zeki Bilgin

Frustrated with Code Quality Issues? LLMs can Help!

As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality…

Artificial Intelligence · Computer Science 2023-09-25 Nalin Wadhwa , Jui Pradhan , Atharv Sonwane , Surya Prakash Sahu , Nagarajan Natarajan , Aditya Kanade , Suresh Parthasarathy , Sriram Rajamani

Metamorphic Testing of Deep Code Models: A Systematic Literature Review

Large language models and deep learning models designed for code intelligence have revolutionized the software engineering field due to their ability to perform various code-related tasks. These models can process source code and software…

Software Engineering · Computer Science 2025-07-31 Ali Asgari , Milan de Koning , Pouria Derakhshanfar , Annibale Panichella