A Split-Window Transformer for Multi-Model Sequence Spammer Detection using Multi-Model Variational Autoencoder

Zhou Yang; Yucai Pang; Hongbo Yin; Yunpeng Xiao

Machine Learning | 2025-02-25 | v1 | Artificial Intelligence | Multimedia | Social and Information Networks

Abstract

This paper introduces MS$^2$Dformer, a new Transformer that can serve as a generalized backbone for multi-modal sequence spammer detection. Spammer detection is a complex multi-modal task, so applying a Transformer raises two challenges. First, noisy multi-modal information about users can interfere with feature mining. Second, the long sequences of users' historical behaviors place heavy GPU memory pressure on the attention computation. To address these problems, we first design a user behavior tokenization algorithm based on the multi-modal variational autoencoder (MVAE). We then propose a hierarchical split-window multi-head attention (SW/W-MHA) mechanism. The split-window strategy hierarchically transforms ultra-long sequences into a combination of intra-window short-term attention and inter-window overall attention. Pre-trained on public datasets, MS$^2$Dformer far exceeds the previous state of the art, and the experiments demonstrate its ability to act as a backbone.
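
The memory argument behind the split-window strategy can be illustrated with a rough count of attention-score entries per head. The sketch below is not the authors' implementation. It assumes non-overlapping, equal-size windows and a single summary token per window for the inter-window stage. The function name `attention_score_entries` and the sizes used (a sequence of 4096 behaviors, windows of 64) are hypothetical and chosen only for illustration.

```python
def attention_score_entries(seq_len: int, window: int) -> dict:
    """Count attention-score entries per head: full vs. split-window attention.

    Assumptions (not taken from the paper): windows do not overlap, all windows
    have the same size, and each window contributes one summary token to the
    inter-window attention stage.
    """
    if window <= 0 or seq_len % window != 0:
        raise ValueError("this sketch requires seq_len to be a multiple of window")
    num_windows = seq_len // window
    full = seq_len * seq_len               # every token attends to every token
    intra = num_windows * window * window  # attention restricted to each window
    inter = num_windows * num_windows      # attention among window summaries
    return {
        "full": full,
        "intra": intra,
        "inter": inter,
        "split_total": intra + inter,
    }


costs = attention_score_entries(seq_len=4096, window=64)
print(costs)
```

Under these assumptions, full attention needs 16,777,216 score entries per head, while the two-level scheme needs 262,144 + 4,096 = 266,240, roughly a 63-fold reduction. The actual savings of SW/W-MHA depend on design details that the abstract does not specify.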

Keywords

attention mechanism, encoder-decoder architecture, vision transformer

Cite

@article{arxiv.2502.16483,
  title  = {A Split-Window Transformer for Multi-Model Sequence Spammer Detection using Multi-Model Variational Autoencoder},
  author = {Zhou Yang and Yucai Pang and Hongbo Yin and Yunpeng Xiao},
  journal= {arXiv preprint arXiv:2502.16483},
  year   = {2025}
}
