Cyber Knowledge Completion Using Large Language Models

Braden K Webb; Sumit Purohit; Rounak Meyur

Categories: Cryptography and Security; Artificial Intelligence. Version: v1 (2024-09-25)

Abstract

The integration of the Internet of Things (IoT) into Cyber-Physical Systems (CPSs) has expanded their cyber-attack surface, introducing new and sophisticated threats with the potential to exploit emerging vulnerabilities. Assessing the risks of CPSs is increasingly difficult because cybersecurity knowledge is often incomplete and outdated, which highlights the urgent need for better-informed risk assessments and mitigation strategies. While previous efforts have relied on rule-based natural language processing (NLP) tools to map vulnerabilities, weaknesses, and attack patterns, recent advances in Large Language Models (LLMs) present an opportunity to enhance cyber-attack knowledge completion through improved reasoning, inference, and summarization capabilities. We apply embedding models to encode information on attack patterns and adversarial techniques, and generate mappings between them using vector embeddings. Additionally, we propose a Retrieval-Augmented Generation (RAG)-based approach that leverages pre-trained models to create structured mappings between different taxonomies of threat patterns. Further, we use a small hand-labeled dataset to compare the proposed RAG-based approach to a standard binary classification baseline. Thus, the proposed approach provides a comprehensive framework to address the challenge of cyber-attack knowledge graph completion.
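
The embedding-based mapping mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the identifiers (`pattern-A`, `technique-X`, and so on), the three-dimensional toy vectors, the `top_k` parameter, and the choice of cosine similarity as the ranking score are all assumptions made for illustration. In practice the vectors would come from an embedding model applied to the textual descriptions of attack patterns and adversarial techniques.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_patterns_to_techniques(pattern_vecs, technique_vecs, top_k=1):
    """For each attack pattern, rank techniques by similarity and keep the top_k candidates."""
    mappings = {}
    for pattern_id, p_vec in pattern_vecs.items():
        scored = [
            (technique_id, cosine_similarity(p_vec, t_vec))
            for technique_id, t_vec in technique_vecs.items()
        ]
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        mappings[pattern_id] = scored[:top_k]
    return mappings


# Hypothetical toy embeddings; real ones would be produced by an embedding model.
pattern_vecs = {
    "pattern-A": [1.0, 0.0, 0.2],
    "pattern-B": [0.0, 1.0, 0.1],
}
technique_vecs = {
    "technique-X": [0.9, 0.1, 0.1],
    "technique-Y": [0.1, 0.8, 0.0],
}

mappings = map_patterns_to_techniques(pattern_vecs, technique_vecs)
for pattern_id, candidates in mappings.items():
    print(pattern_id, "->", candidates)
```

A ranking like this only proposes candidate links. The abstract indicates that candidate mappings are additionally produced by a RAG-based approach and compared against a binary classification baseline on a small hand-labeled dataset.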

Keywords

vulnerability detection; knowledge graph; large language model

Cite

@article{arxiv.2409.16176,
  title  = {Cyber Knowledge Completion Using Large Language Models},
  author = {Braden K Webb and Sumit Purohit and Rounak Meyur},
  journal= {arXiv preprint arXiv:2409.16176},
  year   = {2024}
}

Comments

7 pages, 2 figures. Submitted to 2024 IEEE International Conference on Big Data
