MMA-Diffusion: MultiModal Attack on Diffusion Models

Yijun Yang; Ruiyuan Gao; Xiaosen Wang; Tsung-Yi Ho; Nan Xu; Qiang Xu

Cryptography and Security; Computer Vision and Pattern Recognition (v4, 2024-04-02)

Abstract

In recent years, Text-to-Image (T2I) models have advanced remarkably and gained widespread adoption. However, this progress has inadvertently opened avenues for misuse, particularly the generation of inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion, a framework that presents a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches, MMA-Diffusion leverages both textual and visual modalities to bypass safeguards such as prompt filters and post-hoc safety checkers, thereby exposing vulnerabilities in existing defense mechanisms.

Keywords

adversarial attack

Cite

@article{arxiv.2311.17516,
  title  = {MMA-Diffusion: MultiModal Attack on Diffusion Models},
  author = {Yijun Yang and Ruiyuan Gao and Xiaosen Wang and Tsung-Yi Ho and Nan Xu and Qiang Xu},
  journal= {arXiv preprint arXiv:2311.17516},
  year   = {2024}
}

Comments

CVPR 2024. Our codes and benchmarks are available at https://github.com/cure-lab/MMA-Diffusion
