Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation

Artificial Intelligence 2021-09-09 v1 Software Engineering

Abstract

Identifying vulnerable code is a precautionary measure against software security breaches. Considerable expert effort has gone into building static analyzers, yet insecure patterns can rarely be enumerated exhaustively. This work explores a deep learning approach that automatically learns insecure patterns from code corpora. Because parsed code naturally admits graph structures, we develop a novel graph neural network (GNN) that exploits both the semantic context and the structural regularity of a program to improve prediction performance. Compared with a generic GNN, our enhancements include a synthesis of multiple representations learned from the several parsed graphs of a program, and a new training loss metric that leverages the fine granularity of labeling. Our model outperforms multiple text-, image-, and graph-based approaches across two real-world datasets.
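
To make the idea of "several parsed graphs of a program" concrete, here is a minimal sketch that is not taken from the paper. It assumes a Python snippet and Python's standard `ast` module purely for illustration; the paper's own parser, source language, and graph types are not stated in the abstract. The function name `parse_to_graphs` and the choice of two edge types (parent-to-child and sibling order) are hypothetical. The point is only that one parse can yield separate edge sets over the same nodes, which a model could then encode separately before combining them.

```python
import ast


def parse_to_graphs(source):
    """Parse a snippet and return node labels plus two separate edge lists.

    Hypothetical helper for illustration only: the edge types here are an
    assumption, not the graph types used in the paper.
    """
    labels = []         # node type name, indexed by node id
    child_edges = []    # (parent, child): syntactic nesting
    sibling_edges = []  # (child, next child): left-to-right order

    def visit(node):
        index = len(labels)
        labels.append(type(node).__name__)
        child_ids = [visit(child) for child in ast.iter_child_nodes(node)]
        child_edges.extend((index, c) for c in child_ids)
        sibling_edges.extend(zip(child_ids, child_ids[1:]))
        return index

    visit(ast.parse(source))
    return labels, child_edges, sibling_edges


labels, child_edges, sibling_edges = parse_to_graphs("x = a + b")
print(labels)
print("child edges:", child_edges)
print("sibling edges:", sibling_edges)
```

Keeping the edge sets apart, rather than merging them into one adjacency list, is what allows each relation to be processed on its own in a later stage.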

Cite

@article{arxiv.2109.03341,
  title  = {Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation},
  author = {Yufan Zhuang and Sahil Suneja and Veronika Thost and Giacomo Domeniconi and Alessandro Morari and Jim Laredo},
  journal= {arXiv preprint arXiv:2109.03341},
  year   = {2021}
}

Comments

Submitted June 2020