Related papers: CodeAttack: Code-Based Adversarial Attacks for Pre…

Adversarial Attacks on Code Models with Discriminative Graph Patterns

Pre-trained language models of code are now widely used in various software engineering tasks such as code generation, code completion, vulnerability detection, etc. This, in turn, poses security and reliability risks to these models. One…

Software Engineering · Computer Science 2024-11-01 Thanh-Dat Nguyen, Yang Zhou, Xuan Bach D. Le, Patanamon Thongtanunam, David Lo

Natural Attack for Pre-trained Models of Code

Pre-trained models of code have achieved success in many important software engineering tasks. However, these powerful models are vulnerable to adversarial attacks that slightly perturb model inputs to make a victim model produce wrong…

Software Engineering · Computer Science 2022-03-01 Zhou Yang, Jieke Shi, Junda He, David Lo

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently shown to transfer well to Programming Languages (PL) and largely benefit a broad set of code-related tasks. Despite their success, most current methods…

Computation and Language · Computer Science 2021-09-03 Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi

Transfer Attacks and Defenses for Large Language Models on Coding Tasks

Modern large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities for coding tasks including writing and reasoning about code. They improve upon previous neural network models of code, such as code2seq or…

Machine Learning · Computer Science 2023-11-23 Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina Pasareanu

CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion

The rapid advancement of Large Language Models (LLMs) has brought about remarkable generative capabilities but also raised concerns about their potential misuse. While strategies like supervised fine-tuning and reinforcement learning from…

Computation and Language · Computer Science 2024-09-17 Qibing Ren, Chang Gao, Jing Shao, Junchi Yan, Xin Tan, Wai Lam, Lizhuang Ma

An Extensive Study on Adversarial Attack against Pre-trained Models of Code

Transformer-based pre-trained models of code (PTMC) have been widely utilized and have achieved state-of-the-art performance in many mission-critical applications. However, they can be vulnerable to adversarial attacks through identifier…

Cryptography and Security · Computer Science 2023-11-27 Xiaohu Du, Ming Wen, Zichao Wei, Shangwen Wang, Hai Jin

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code…

Computation and Language · Computer Science 2020-09-21 Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou

Stealthy Backdoor Attack for Code Models

Code models, such as CodeBERT and CodeT5, offer general-purpose representations of code and play a vital role in supporting downstream automated software engineering tasks. Most recently, code models were revealed to be vulnerable to…

Cryptography and Security · Computer Science 2023-08-30 Zhou Yang, Bowen Xu, Jie M. Zhang, Hong Jin Kang, Jieke Shi, Junda He, David Lo

Evaluating Pre-Trained Models for Multi-Language Vulnerability Patching

Software vulnerabilities pose critical security risks, demanding prompt and effective mitigation strategies. While advancements in Automated Program Repair (APR) have primarily targeted general software bugs, the domain of vulnerability…

Software Engineering · Computer Science 2025-01-14 Zanis Ali Khan, Aayush Garg, Yuejun Guo, Qiang Tang

Code Vulnerability Detection Across Different Programming Languages with AI Models

Security vulnerabilities present in a code that has been written in diverse programming languages are among the most critical yet complicated aspects of source code to detect. Static analysis tools based on rule-based patterns usually do…

Cryptography and Security · Computer Science 2025-08-19 Hael Abdulhakim Ali Humran, Ferdi Sonmez

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing benchmarks often lack coverage for subtle corner cases, allowing incorrect solutions to pass.…

Software Engineering · Computer Science 2026-02-25 Jingwei Shi, Xinxiang Yin, Jing Huang, Jinman Zhao, Shengyu Tao

ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning

Large-scale pre-trained models such as CodeBERT and GraphCodeBERT have earned widespread attention from both academia and industry. Owing to their superior ability in code representation, they have been further applied in multiple…

Software Engineering · Computer Science 2023-01-24 Shangqing Liu, Bozhi Wu, Xiaofei Xie, Guozhu Meng, Yang Liu

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

The prompt-based learning paradigm, which bridges the gap between pre-training and fine-tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot settings. Despite being widely applied, prompt-based…

Computation and Language · Computer Science 2024-02-05 Shuai Zhao, Jinming Wen, Luu Anh Tuan, Junbo Zhao, Jie Fu

Studying Vulnerable Code Entities in R

Pre-trained Code Language Models (Code-PLMs) have shown many advancements and achieved state-of-the-art results for many software engineering tasks in the past few years. These models are mainly targeted for popular programming languages…

Software Engineering · Computer Science 2024-02-08 Zixiao Zhao, Millon Madhur Das, Fatemeh H. Fard

BeamAttack: Generating High-quality Textual Adversarial Examples through Beam Search and Mixed Semantic Spaces

Natural language processing models based on neural networks are vulnerable to adversarial examples. These adversarial examples are imperceptible to human readers but can mislead models to make the wrong predictions. In a black-box setting,…

Computation and Language · Computer Science 2023-03-14 Hai Zhu, Qingyang Zhao, Yuren Wu

TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP

While there has been substantial research using adversarial attacks to analyze NLP models, each attack is implemented in its own code repository. It remains challenging to develop NLP attacks and utilize them to improve model performance.…

Computation and Language · Computer Science 2020-10-06 John X. Morris, Eli Lifland, Jin Yong Yoo, Jake Grigsby, Di Jin, Yanjun Qi

OpenAttack: An Open-source Textual Adversarial Attack Toolkit

Textual adversarial attacking has received wide and increasing attention in recent years. Various attack models have been proposed, which are enormously distinct and implemented with different programming frameworks and settings. These…

Computation and Language · Computer Science 2021-09-27 Guoyang Zeng, Fanchao Qi, Qianrui Zhou, Tingji Zhang, Zixian Ma, Bairu Hou, Yuan Zang, Zhiyuan Liu, Maosong Sun

CodeBERT-nt: code naturalness via CodeBERT

Much of software-engineering research relies on the naturalness of code, the fact that code, in small code snippets, is repetitive and can be predicted using statistical language models like n-gram. Although powerful, training such models…

Software Engineering · Computer Science 2022-08-15 Ahmed Khanfir, Matthieu Jimenez, Mike Papadakis, Yves Le Traon

Protecting Feed-Forward Networks from Adversarial Attacks Using Predictive Coding

An adversarial example is a modified input image designed to cause a Machine Learning (ML) model to make a mistake; these perturbations are often invisible or subtle to human observers and highlight vulnerabilities in a model's ability to…

Cryptography and Security · Computer Science 2024-11-04 Ehsan Ganjidoost, Jeff Orchard

Multi-target Backdoor Attacks for Code Pre-trained Models

Backdoor attacks for neural code models have gained considerable attention due to the advancement of code intelligence. However, most existing works insert triggers into task-specific data for code-related downstream tasks, thereby limiting…

Cryptography and Security · Computer Science 2023-06-16 Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang, Yang Liu