EfficientIML: Efficient High-Resolution Image Manipulation Localization

Computer Vision and Pattern Recognition 2025-09-11 v1

Abstract

As imaging devices deliver ever-higher resolutions and diffusion-based forgery methods emerge, current detectors trained only on traditional datasets (covering splicing, copy-move, and object-removal forgeries) lack exposure to this new manipulation type. To address this, we introduce a novel high-resolution SIF dataset of 1200+ diffusion-generated manipulations with semantically extracted masks. However, such high-resolution data also poses a challenge for existing methods, whose prohibitive computational complexity runs into significant resource constraints. We therefore propose EfficientIML, a novel model with a lightweight, three-stage EfficientRWKV backbone. EfficientRWKV's hybrid state-space and attention network captures global context and local details in parallel, while a multi-scale supervision strategy enforces consistency across hierarchical predictions. Extensive evaluations on our dataset and standard benchmarks demonstrate that our approach outperforms ViT-based and other SOTA lightweight baselines in localization performance, FLOPs, and inference speed, underscoring its suitability for real-time forensic applications.

Cite

@article{arxiv.2509.08583,
  title  = {EfficientIML: Efficient High-Resolution Image Manipulation Localization},
  author = {Jinhan Li and Haoyang He and Lei Xie and Jiangning Zhang},
  journal= {arXiv preprint arXiv:2509.08583},
  year   = {2025}
}