Binary VPN Traffic Detection Using Wavelet Features and Machine Learning

Networking and Internet Architecture 2025-08-14 v3

Abstract

Encrypted traffic classification faces growing challenges as encryption renders traditional deep packet inspection ineffective. This study addresses binary VPN detection, distinguishing VPN-encrypted from non-VPN traffic using wavelet transform-based features across multiple machine learning models. Unlike previous studies focused on application-level classification within encrypted traffic, we specifically evaluate the fundamental task of VPN identification regardless of application type. We analyze the impact of wavelet decomposition levels and dataset filtering on classification performance across significantly imbalanced data, where filtering reduces some traffic categories by up to 95%. Our results demonstrate that Random Forest (RF) achieves superior performance with an F1-score of 99%, maintaining robust accuracy even after significant dataset filtering. Neural Networks (NN) show comparable effectiveness with an F1-score of 98% when trained on wavelet level 12, while Support Vector Machines (SVM) exhibit notable sensitivity to dataset reduction, with F1-scores dropping from 90% to 85% after filtering. Comparing wavelet decomposition at levels 5 and 12, we observe improved classification performance at level 12, particularly for variable traffic types, though the marginal gains may not justify the additional computational overhead. These findings establish RF as the most reliable model for VPN traffic classification while highlighting key performance tradeoffs in feature extraction and preprocessing.
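The role of the decomposition level can be illustrated with a minimal sketch. The abstract does not specify the wavelet family, the feature definition, or the input signal, so everything below is an assumption made for illustration: a Haar wavelet, per-level detail-coefficient energy as the feature, and a hypothetical per-flow series of packet sizes. The function names `haar_dwt` and `wavelet_energy_features` are likewise hypothetical and are not taken from the paper.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal = list(signal) + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Energy of the detail coefficients at each level, then the energy of
    the final approximation. Stops early if the signal becomes too short."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical packet-size series for a single flow (illustrative values only).
flow = [4, 6, 10, 12, 8, 6, 5, 5]
features = wavelet_energy_features(flow, levels=3)
```

Under these assumptions, a deeper decomposition yields a longer feature vector (one entry per level plus one), which is one way to see why level 12 costs more to compute than level 5. The resulting vectors would then be passed to a classifier such as RF, NN, or SVM.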

Cite

@article{arxiv.2502.13804,
  title  = {Binary VPN Traffic Detection Using Wavelet Features and Machine Learning},
  author = {Yasameen Sajid Razooqi and Adrian Pekar},
  journal = {arXiv preprint arXiv:2502.13804},
  year   = {2025}
}

Comments

Accepted for presentation at SoftCOM 2025