Multi-View Pre-Trained Model for Code Vulnerability Identification

Software Engineering 2022-08-11 v1 Artificial Intelligence

Abstract

Vulnerability identification is crucial for cyber security in the software-related industry. Early identification methods require significant manual effort to craft features or annotate vulnerable code. Although recent pre-trained models alleviate this issue, they overlook the rich, multi-type structural information contained in the code itself. In this paper, we propose a novel Multi-View Pre-Trained Model (MV-PTM) that encodes both sequential and multi-type structural information of the source code and uses contrastive learning to enhance code representations. Experiments conducted on two public datasets demonstrate the superiority of MV-PTM. In particular, MV-PTM outperforms GraphCodeBERT by 3.36% on average in terms of F1 score.

Cite

@article{arxiv.2208.05227,
  title  = {Multi-View Pre-Trained Model for Code Vulnerability Identification},
  author = {Xuxiang Jiang and Yinhao Xiao and Jun Wang and Wei Zhang},
  journal = {arXiv preprint arXiv:2208.05227},
  year   = {2022}
}

Comments

Accepted by WASA 2022