Static analysis of executable files by machine learning methods

Cryptography and Security 2020-07-16 v1 Machine Learning Machine Learning

Abstract

The paper describes how to detect malicious executable files through static analysis of their binary content. It analyzes the stages of pre-processing and cleaning the data extracted from different areas of executable files. It considers methods for encoding categorical attributes of executable files, along with ways to reduce the dimensionality of the feature space and to select characteristic features, so that samples of binary executables are represented effectively for subsequent classifier training. An ensemble learning approach is applied to aggregate the predictions of the individual classifiers, and an ensemble of classifiers trained on different feature groups of executable file attributes is built, with the aim of subsequently developing a system for detecting malicious files in a non-isolated environment.
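The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which aggregation rule is used, so plain averaging of per-classifier probabilities (soft voting) is an assumption here, as are the feature group names, the `soft_vote` helper, and the 0.5 threshold.

```python
from statistics import mean


def soft_vote(group_probs, threshold=0.5):
    """Average per-group 'malicious' probabilities and apply a threshold.

    group_probs maps a feature-group name to the probability reported by
    the classifier trained on that group. Averaging is an assumed rule,
    not necessarily the one used in the paper.
    """
    if not group_probs:
        raise ValueError("need at least one classifier output")
    score = mean(group_probs.values())
    return score, score >= threshold


# Hypothetical outputs of three classifiers, one per feature group.
sample = {"headers": 0.9, "imports": 0.7, "byte_histogram": 0.2}
score, is_malicious = soft_vote(sample)
```

Here two of the three hypothetical classifiers are confident enough to push the averaged score above the threshold, so the sample is flagged even though one feature group looks benign. A weighted average or a trained meta-classifier would be natural alternatives.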

Cite

@article{arxiv.2007.07501,
  title  = {Static analysis of executable files by machine learning methods},
  author = {Nikolay Prudkovskiy},
  journal= {arXiv preprint arXiv:2007.07501},
  year   = {2020}
}

Comments

36 pages, 13 figures, 6 tables