Malware Task Identification: A Data Driven Approach

Cryptography and Security 2015-07-08 v1

Abstract

Identifying the tasks a given piece of malware was designed to perform (e.g., logging keystrokes, recording video, or establishing remote access) is a difficult and time-consuming operation that is largely human-driven in practice. In this paper, we present an automated method to identify malware tasks. Using two different malware collections, we explore various circumstances for each, including cases where the training data differs significantly from the test data, where the malware being evaluated employs packing to thwart analytical techniques, and conditions with sparse training data. We find that this approach consistently outperforms the current state-of-the-art software for malware task identification as well as standard machine learning approaches, often achieving an unbiased F1 score of over 0.9. In the near future, we look to deploy our approach for use by analysts in an operational cyber-security environment.

Cite

@article{arxiv.1507.01930,
  title  = {Malware Task Identification: A Data Driven Approach},
  author = {Eric Nunes and Casey Buto and Paulo Shakarian and Christian Lebiere and Stefano Bennati and Robert Thomson and Holger Jaenisch},
  journal= {arXiv preprint arXiv:1507.01930},
  year   = {2015}
}

Comments

8-page full paper, accepted at FOSINT-SI (2015)
