Related papers: Machine Learning Transferability for Malware Detec…

EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models

This paper describes EMBER: a labeled benchmark dataset for training machine learning models to statically detect malicious Windows portable executable files. The dataset includes features extracted from 1.1M binary files: 900K training…

Cryptography and Security · Computer Science 2018-04-18 Hyrum S. Anderson, Phil Roth

Towards an Automated Pipeline for Detecting and Classifying Malware through Machine Learning

The constant growth in the number of malware - software or code fragment potentially harmful for computers and information networks - and the use of sophisticated evasion and obfuscation techniques have seriously hindered classic…

Cryptography and Security · Computer Science 2021-06-11 Nicola Loi, Claudio Borile, Daniele Ucci

Multimodal Techniques for Malware Classification

The threat of malware is a serious concern for computer networks and systems, highlighting the need for accurate classification techniques. In this research, we experiment with multimodal machine learning approaches for malware…

Cryptography and Security · Computer Science 2025-01-22 Jonathan Jiang, Mark Stamp

Efficient Malware Analysis Using Metric Embeddings

In this paper, we explore the use of metric learning to embed Windows PE files in a low-dimensional vector space for downstream use in a variety of applications, including malware detection, family classification, and malware attribute…

Machine Learning · Computer Science 2022-12-07 Ethan M. Rudd, David Krisiloff, Scott Coull, Daniel Olszewski, Edward Raff, James Holt

Machine Learning for Detecting Malware in PE Files

The increasing number of sophisticated malware poses a major cybersecurity threat. Portable executable (PE) files are a common vector for such malware. In this work we review and evaluate machine learning-based PE malware detection…

Cryptography and Security · Computer Science 2022-12-29 Collin Connors, Dilip Sarkar

Evaluating Ensemble and Deep Learning Models for Static Malware Detection with Dimensionality Reduction Using the EMBER Dataset

This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset, which contains feature representations of Portable Executable (PE) files. We evaluate eight…

Cryptography and Security · Computer Science 2025-07-28 Md Min-Ha-Zul Abedin, Tazqia Mehrub

MERLIN -- Malware Evasion with Reinforcement LearnINg

In addition to signature-based and heuristics-based detection techniques, machine learning (ML) is widely used to generalize to new, never-before-seen malicious software (malware). However, it has been demonstrated that ML models can be…

Cryptography and Security · Computer Science 2022-03-31 Tony Quertier, Benjamin Marais, Stéphane Morucci, Bertrand Fournel

EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis

In recent years there has been a shift from heuristics-based malware detection towards machine learning, which proves to be more robust in the current heavily adversarial threat landscape. While we acknowledge machine learning to be better…

Machine Learning · Computer Science 2023-10-04 Dragos Georgian Corlatescu, Alexandru Dinu, Mihaela Gaman, Paul Sumedrea

Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection

Recent work has shown that adversarial Windows malware samples - referred to as adversarial EXEmples in this paper - can bypass machine learning-based detection relying on static code analysis by perturbing relatively few input bytes. To…

Cryptography and Security · Computer Science 2021-06-29 Luca Demetrio, Scott E. Coull, Battista Biggio, Giovanni Lagorio, Alessandro Armando, Fabio Roli

Optimized Deep Learning Models for Malware Detection under Concept Drift

Despite the promising results of machine learning models in malicious files detection, they face the problem of concept drift due to their constant evolution. This leads to declining performance over time, as the data distribution of the…

Cryptography and Security · Computer Science 2024-08-02 William Maillet, Benjamin Marais

Enhancing Decision-Making in Windows PE Malware Classification During Dataset Shifts with Uncertainty Estimation

Artificial intelligence techniques have achieved strong performance in classifying Windows Portable Executable (PE) malware, but their reliability often degrades under dataset shifts, leading to misclassifications with severe security…

Cryptography and Security · Computer Science 2025-12-23 Rahul Yumlembam, Biju Issac, Seibu Mary Jacob

EMBER2024 -- A Benchmark Dataset for Holistic Evaluation of Malware Classifiers

A lack of accessible data has historically restricted malware analysis research, and practitioners have relied heavily on datasets provided by industry sources to advance. Existing public datasets are limited by narrow scope - most include…

Cryptography and Security · Computer Science 2025-06-06 Robert J. Joyce, Gideon Miller, Phil Roth, Richard Zak, Elliott Zaresky-Williams, Hyrum Anderson, Edward Raff, James Holt

Predicting Vulnerability to Malware Using Machine Learning Models: A Study on Microsoft Windows Machines

In an era of escalating cyber threats, malware poses significant risks to individuals and organizations, potentially leading to data breaches, system failures, and substantial financial losses. This study addresses the urgent need for…

Cryptography and Security · Computer Science 2025-01-28 Marzieh Esnaashari, Nima Moradi

A Comprehensive Study on Learning-Based PE Malware Family Classification Methods

Driven by the high profit, Portable Executable (PE) malware has been consistently evolving in terms of both volume and sophistication. PE malware family classification has gained great attention and a large number of approaches have been…

Cryptography and Security · Computer Science 2021-11-01 Yixuan Ma, Shuang Liu, Jiajun Jiang, Guanhong Chen, Keqiu Li

Explainable Malware Detection with Tailored Logic Explained Networks

Malware detection is a constant challenge in cybersecurity due to the rapid development of new attack techniques. Traditional signature-based approaches struggle to keep pace with the sheer volume of malware samples. Machine learning offers…

Cryptography and Security · Computer Science 2024-05-07 Peter Anthony, Francesco Giannini, Michelangelo Diligenti, Martin Homola, Marco Gori, Stefan Balogh, Jan Mojzis

MalwarePT: A Binary-Level Foundation Model for Malware Analysis

Automated malware analysis increasingly relies on machine learning, yet most existing methods remain task-specific and depend on handcrafted features or narrowly scoped models. Recent developments in binary-level foundation models suggest a…

Cryptography and Security · Computer Science 2026-05-19 Saastha Vasan, Yuzhou Nie, Kaie Chen, Yigitcan Kaya, Hojjat Aghakhani, Roman Vasilenko, Wenbo Guo, Christopher Kruegel, Giovanni Vigna

Semantic Data Representation for Explainable Windows Malware Detection Models

Ontologies are a standard tool for creating semantic schemata in many knowledge intensive domains of human interest. They are becoming increasingly important also in the areas that have been until very recently dominated by subsymbolic…

Cryptography and Security · Computer Science 2025-09-26 Peter Švec, Štefan Balogh, Martin Homola, Ján Kľuka, Tomáš Bisták, Peter Anthony

Adversarial Malware Binaries: Evading Deep Learning for Malware Detection in Executables

Machine-learning methods have already been exploited as useful tools for detecting malicious executable files. They leverage data retrieved from malware samples, such as header fields, instruction sequences, or even raw bytes, to learn…

Cryptography and Security · Computer Science 2018-03-13 Bojan Kolosnjaji, Ambra Demontis, Battista Biggio, Davide Maiorca, Giorgio Giacinto, Claudia Eckert, Fabio Roli

Multi-feature Dataset for Windows PE Malware Classification

This paper describes a multi-feature dataset for training machine learning classifiers for detecting malicious Windows Portable Executable (PE) files. The dataset includes four feature sets from 18,551 binary samples belonging to five…

Cryptography and Security · Computer Science 2022-10-31 Muhammad Irfan Yousuf, Izza Anwer, Tanzeela Shakir, Minahil Siddiqui, Maysoon Shahid

Foundational Models for Malware Embeddings Using Spatio-Temporal Parallel Convolutional Networks

In today's interconnected digital landscape, the proliferation of malware poses a significant threat to the security and stability of computer networks and systems worldwide. As the complexity of malicious tactics, techniques, and…

Cryptography and Security · Computer Science 2023-05-26 Dhruv Nandakumar, Devin Quinn, Elijah Soba, Eunyoung Kim, Christopher Redino, Chris Chan, Kevin Choi, Abdul Rahman, Edward Bowen