FuncFooler: A Practical Black-box Attack Against Learning-based Binary Code Similarity Detection Methods

Cryptography and Security 2022-08-31 v1 Artificial Intelligence Machine Learning

Abstract

Binary code similarity detection (BCSD) measures the similarity of two pieces of binary executable code. Recently, learning-based BCSD methods have achieved great success, outperforming traditional BCSD in both detection accuracy and efficiency. However, the adversarial vulnerability of learning-based BCSD methods has received little study, even though this vulnerability poses hazards in security-related applications. To evaluate adversarial robustness, this paper designs an efficient black-box adversarial code generation algorithm, FuncFooler. FuncFooler constrains the adversarial code 1) to keep the program's control flow graph (CFG) unchanged, and 2) to preserve the same semantic meaning. Specifically, FuncFooler successively 1) determines vulnerable candidates in the malicious code, 2) chooses and inserts adversarial instructions from the benign code, and 3) corrects the semantic side effects of the adversarial code to meet the constraints. Empirically, FuncFooler successfully attacks three learning-based BCSD models, SAFE, Asm2Vec, and jTrans, which calls into question whether learning-based BCSD is desirable.
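The three steps named in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: `toy_similarity` is a hypothetical stand-in for a black-box BCSD model, the vulnerability score (similarity drop when an instruction is removed) is one plausible heuristic, and the push/pop wrapper is a simplified side-effect fix that ignores flags and memory operands.

```python
from collections import Counter

def toy_similarity(f, g):
    """Hypothetical black-box oracle: multiset Jaccard over instruction strings."""
    a, b = Counter(f), Counter(g)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0

def is_cfg_neutral(ins):
    # Only non-control-flow instructions are inserted, so the CFG stays unchanged.
    return not ins.split()[0].startswith(("j", "call", "ret", "loop"))

def rank_positions(original, oracle):
    """Step 1 (assumed heuristic): rank positions by similarity drop on removal."""
    scores = []
    for i in range(len(original)):
        ablated = original[:i] + original[i + 1:]
        scores.append((1.0 - oracle(original, ablated), i))
    return [i for _, i in sorted(scores, reverse=True)]

def wrap_with_fix(ins):
    """Step 3 (simplified): save and restore the destination register.
    Assumes the first operand is a register; flags are not preserved here."""
    parts = ins.replace(",", " ").split()
    if len(parts) < 2:
        return [ins]
    dst = parts[1]
    return [f"push {dst}", ins, f"pop {dst}"]

def flatten(blocks):
    return [ins for blk in blocks for ins in blk]

def attack(original, benign_pool, oracle, threshold=0.5):
    """Greedy black-box loop: insert benign instructions until similarity < threshold."""
    blocks = [[ins] for ins in original]  # one block per original instruction
    candidates = [ins for ins in benign_pool if is_cfg_neutral(ins)]
    for pos in rank_positions(original, oracle):
        best_score = oracle(original, flatten(blocks))
        if best_score < threshold:
            break
        best_patch = None
        # Step 2: pick the benign instruction that lowers similarity the most.
        for ins in candidates:
            patch = wrap_with_fix(ins)
            trial = blocks[:pos] + [patch + blocks[pos]] + blocks[pos + 1:]
            score = oracle(original, flatten(trial))
            if score < best_score:
                best_score, best_patch = score, patch
        if best_patch is not None:
            blocks[pos] = best_patch + blocks[pos]
    return flatten(blocks)

if __name__ == "__main__":
    malicious = ["mov eax, 1", "add eax, ebx", "ret"]
    benign = ["xor ecx, ecx", "jmp L1", "lea edx, [esi+4]"]
    adversarial = attack(malicious, benign, toy_similarity, threshold=0.3)
    print(adversarial, toy_similarity(malicious, adversarial))
```

The sketch only queries the oracle for similarity scores, which is what makes it black-box; a real attack would replace `toy_similarity` with queries to the target model and would need a side-effect correction that also accounts for flags, the stack, and memory.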

Cite

@article{arxiv.2208.14191,
  title  = {FuncFooler: A Practical Black-box Attack Against Learning-based Binary Code Similarity Detection Methods},
  author = {Lichen Jia and Bowen Tang and Chenggang Wu and Zhe Wang and Zihan Jiang and Yuanming Lai and Yan Kang and Ning Liu and Jingfeng Zhang},
  journal= {arXiv preprint arXiv:2208.14191},
  year   = {2022}
}

Comments

9 pages, 4 figures