Related papers: Unlearning Backdoor Attacks through Gradient-Based…

Mitigating Backdoor Attacks using Activation-Guided Model Editing

Backdoor attacks compromise the integrity and reliability of machine learning models by embedding a hidden trigger during the training process, which can later be activated to cause unintended misbehavior. We propose a novel backdoor…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-01 · Felix Hsieh, Huy H. Nguyen, AprilPyone MaungMaung, Dmitrii Usynin, Isao Echizen

Backdoor Mitigation via Invertible Pruning Masks

Model pruning has gained traction as a promising defense strategy against backdoor attacks in deep learning. However, existing pruning-based approaches often fall short in accurately identifying and removing the specific parameters…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-16 · Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu, Raja Jurdak

Pruning Strategies for Backdoor Defense in LLMs

Backdoor attacks are a significant threat to the performance and integrity of pre-trained language models. Although such models are routinely fine-tuned for downstream NLP tasks, recent work shows they remain vulnerable to backdoor attacks…

Machine Learning · Computer Science · 2025-08-28 · Santosh Chapagain, Shah Muhammad Hamdi, Soukaina Filali Boubrahimi

Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models

Transfer learning provides an effective solution for feasibly and quickly customizing accurate Student models by transferring the learned knowledge of pre-trained Teacher models over large datasets via fine-tuning. Many…

Machine Learning · Computer Science · 2020-08-11 · Shuo Wang, Surya Nepal, Carsten Rudolph, Marthie Grobler, Shangyu Chen, Tianle Chen

Exploiting Machine Unlearning for Backdoor Attacks in Deep Learning System

In recent years, the security issues of artificial intelligence have become increasingly prominent due to the rapid development of deep learning research and applications. A backdoor attack is an attack targeting the vulnerability of deep…

Cryptography and Security · Computer Science · 2023-12-14 · Peixin Zhang, Jun Sun, Mingtian Tan, Xinyu Wang

When Forgetting Triggers Backdoors: A Clean Unlearning Attack

Machine unlearning has emerged as a key component in ensuring the "Right to be Forgotten", enabling the removal of specific data points from trained models. However, even when the unlearning is performed without poisoning the forget-set…

Cryptography and Security · Computer Science · 2025-06-17 · Marco Arazzi, Antonino Nocera, Vinod P

Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced…

Cryptography and Security · Computer Science · 2018-06-01 · Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg

Mitigating Backdoor Attacks in Federated Learning

Malicious clients can attack federated learning systems using malicious data, including backdoor samples, during the training phase. The compromised global model will perform well on the validation dataset designed for the task, but a small…

Cryptography and Security · Computer Science · 2021-01-18 · Chen Wu, Xian Yang, Sencun Zhu, Prasenjit Mitra

Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies

The widespread adoption of deep learning across various industries has introduced substantial challenges, particularly in terms of model explainability and security. The inherent complexity of deep learning models, while contributing to…

Cryptography and Security · Computer Science · 2025-01-08 · Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu, Raja Jurdak

IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks

Backdoor attacks are an insidious security threat against machine learning models. Adversaries can manipulate the predictions of compromised models by inserting triggers into the training phase. Various backdoor attacks have been devised…

Computation and Language · Computer Science · 2023-05-29 · Xuanli He, Jun Wang, Benjamin Rubinstein, Trevor Cohn

Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning

Backdoor attacks pose a persistent security risk to deep neural networks (DNNs) due to their stealth and durability. While recent research has explored leveraging model unlearning mechanisms to enhance backdoor concealment, existing attack…

Cryptography and Security · Computer Science · 2025-10-16 · Baogang Song, Dongdong Zhao, Jianwen Xiang, Qiben Xu, Zizhuo Yu

Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

In the context of machine unlearning, the primary challenge lies in effectively removing traces of private data from trained models while maintaining model performance and security against privacy attacks like membership inference attacks.…

Machine Learning · Computer Science · 2024-06-26 · Tao Huang, Ziyang Chen, Jiayang Meng, Qingyu Huang, Xu Yang, Xun Yi, Ibrahim Khalil

Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models

Supervised fine-tuning has become the predominant method for adapting large pretrained models to downstream tasks. However, recent studies have revealed that these models are vulnerable to backdoor attacks, where even a small number of…

Cryptography and Security · Computer Science · 2025-01-08 · Peihai Jiang, Xixiang Lyu, Yige Li, Jing Ma

Fine-Tuning Is All You Need to Mitigate Backdoor Attacks

Backdoor attacks represent one of the major threats to machine learning models. Various efforts have been made to mitigate backdoors. However, existing defenses have become increasingly complex and often require high computational resources…

Cryptography and Security · Computer Science · 2022-12-20 · Zeyang Sha, Xinlei He, Pascal Berrang, Mathias Humbert, Yang Zhang

Variance-Based Defense Against Blended Backdoor Attacks

Backdoor attacks represent a subtle yet effective class of cyberattacks targeting AI models, primarily due to their stealthy nature. The model behaves normally on clean data but exhibits malicious behavior only when the attacker embeds a…

Machine Learning · Computer Science · 2025-09-29 · Sujeevan Aseervatham, Achraf Kerzazi, Younès Bennani

Reconstructive Neuron Pruning for Backdoor Defense

Deep neural networks (DNNs) have been found to be vulnerable to backdoor attacks, raising security concerns about their deployment in mission-critical applications. While existing defense methods have demonstrated promising results, it is…

Machine Learning · Computer Science · 2023-12-11 · Yige Li, Xixiang Lyu, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Bo Li, Yu-Gang Jiang

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications…

Cryptography and Security · Computer Science · 2018-08-31 · Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, David Miller

Backdoor Defense with Machine Unlearning

Backdoor injection attacks are an emerging threat to the security of neural networks; however, effective defense methods against them remain limited. In this paper, we propose BAERASE, a novel method that can erase the…

Cryptography and Security · Computer Science · 2022-01-25 · Yang Liu, Mingyuan Fan, Cen Chen, Ximeng Liu, Zhuo Ma, Li Wang, Jianfeng Ma

Identifying Backdoor Attacks in Federated Learning via Anomaly Detection

Federated learning has seen increased adoption in recent years in response to the growing regulatory demand for data privacy. However, the opaque local training process of federated learning also sparks rising concerns about model…

Artificial Intelligence · Computer Science · 2023-08-24 · Yuxi Mi, Yiheng Sun, Jihong Guan, Shuigeng Zhou

Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats

Multimodal contrastive learning uses various data modalities to create high-quality features, but its reliance on extensive data sources on the Internet makes it vulnerable to backdoor attacks. These attacks insert malicious behaviors…

Cryptography and Security · Computer Science · 2024-10-01 · Kuanrong Liu, Siyuan Liang, Jiawei Liang, Pengwen Dai, Xiaochun Cao