OpCode-Based Malware Classification Using Machine Learning and Deep Learning Techniques

Cryptography and Security 2025-04-21 v1 Machine Learning

Abstract

This technical report analyzes malware classification based on OpCode sequences. Two approaches are evaluated: traditional machine learning, which applies n-gram analysis with Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree classifiers; and deep learning, which employs a Convolutional Neural Network (CNN). The traditional approach establishes a baseline using handcrafted 1-gram and 2-gram features extracted from disassembled malware samples. The deep learning approach builds on "Deep Android Malware Detection" by McLaughlin et al. and evaluates a CNN trained to extract features automatically from raw OpCode data. The two approaches are compared using standard performance metrics (accuracy, precision, recall, and F1-score). The SVM classifier outperforms the other traditional techniques, while the CNN achieves competitive performance with the added benefit of automated feature extraction.
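To make the handcrafted feature step concrete, the following is a minimal sketch of how 1-gram and 2-gram counts might be derived from one disassembled sample. It is an illustration under stated assumptions, not the report's implementation: the function names `opcode_ngrams` and `feature_vector`, the toy opcode list, and the choice to normalize by the total n-gram count are all hypothetical.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Count contiguous n-grams in a list of opcode mnemonics."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def feature_vector(opcodes, vocabulary):
    """Map one sample to relative frequencies over a fixed n-gram vocabulary.

    Assumption: 1-gram and 2-gram counts are pooled and divided by their
    combined total; the report may use a different weighting.
    """
    counts = opcode_ngrams(opcodes, 1) + opcode_ngrams(opcodes, 2)
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [counts[gram] / total for gram in vocabulary]


# Toy opcode sequence (hypothetical, not taken from the report's dataset).
sample = ["push", "mov", "call", "mov", "call", "ret"]

unigrams = opcode_ngrams(sample, 1)
bigrams = opcode_ngrams(sample, 2)

# In practice the vocabulary would be built from the whole training set.
vocab = sorted(set(unigrams) | set(bigrams))
vector = feature_vector(sample, vocab)
```

Vectors of this kind, one per sample, could then be passed to any of the classifiers named above (SVM, KNN, or Decision Tree). The CNN approach skips this step and learns its features from the raw OpCode sequence.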

Cite

@article{arxiv.2504.13408,
  title  = {OpCode-Based Malware Classification Using Machine Learning and Deep Learning Techniques},
  author = {Varij Saini and Rudraksh Gupta and Neel Soni},
  journal= {arXiv preprint arXiv:2504.13408},
  year   = {2025}
}

Comments

11 pages