Partial train and isolate, mitigate backdoor attack

Yong Li; Han Gao

Computer Vision and Pattern Recognition 2024-06-07 v2

Abstract

Neural networks are widely known to be vulnerable to backdoor attacks, in which an attacker poisons a portion of the training data so that the target model performs well on normal data while outputting attacker-specified or random categories on the poisoned samples. Backdoor attacks pose a serious threat. Poisoned samples are becoming increasingly similar to their normal counterparts, to the point that even the human eye cannot easily distinguish them. Moreover, the accuracy of models carrying backdoors on normal samples is no different from that of clean models. In this article, based on observed characteristics of backdoor attacks, we propose a new model training method (PT) that freezes part of the model in order to train a model that can isolate suspicious samples. On this basis, a clean model is then fine-tuned to resist backdoor attacks.
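
The pipeline described in the abstract (train with part of the model frozen, isolate suspicious samples, then fine-tune on the rest) can be illustrated with a minimal toy sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the tiny linear model, the function names `partial_train` and `isolate`, the isolated fraction, and above all the scoring rule (per-sample squared loss, highest first), since the abstract does not state how suspicious samples are ranked.

```python
def features(x, frozen_w):
    # Frozen part of the model: a fixed feature extractor, never updated.
    return sum(w * xi for w, xi in zip(frozen_w, x))

def partial_train(data, frozen_w, epochs=2000, lr=0.1):
    # Trainable part: a scale and a bias, fitted by full-batch gradient descent.
    scale, bias = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        g_scale = g_bias = 0.0
        for x, y in data:
            f = features(x, frozen_w)
            err = scale * f + bias - y
            g_scale += err * f / n
            g_bias += err / n
        scale -= lr * g_scale
        bias -= lr * g_bias
    return scale, bias

def isolate(data, frozen_w, head, fraction, highest_first=True):
    # ASSUMPTION: samples are ranked by squared loss; the real criterion may differ.
    scale, bias = head
    losses = [(scale * features(x, frozen_w) + bias - y) ** 2 for x, y in data]
    k = max(1, int(len(data) * fraction))
    ranked = sorted(range(len(data)), key=lambda i: losses[i], reverse=highest_first)
    suspicious = sorted(ranked[:k])
    clean = [i for i in range(len(data)) if i not in suspicious]
    return suspicious, clean

# Hypothetical toy data: five consistent samples and one with a corrupted target.
frozen_w = [1.0, 1.0]
data = [([0.0, 0.0], 0.0), ([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0),
        ([1.0, 1.0], 2.0), ([0.5, 0.5], 1.0), ([1.0, 0.5], -3.0)]

head = partial_train(data, frozen_w)
suspicious, clean = isolate(data, frozen_w, head, fraction=0.2)
# Fine-tuning stage: retrain on the samples that were not isolated.
clean_head = partial_train([data[i] for i in clean], frozen_w)
```

In this toy setting the corrupted sample is the one that gets isolated, and retraining on the remaining samples recovers the underlying relation. Whether high or low loss marks a poisoned sample in a real backdoor setting is left open here, which is why `highest_first` is exposed as a parameter.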

Keywords

backdoor attack; data poisoning attack; privacy attacks on machine learning

Cite

@article{arxiv.2405.16488,
  title  = {Partial train and isolate, mitigate backdoor attack},
  author = {Yong Li and Han Gao},
  journal= {arXiv preprint arXiv:2405.16488},
  year   = {2024}
}

Comments

9 pages, 2 figures
