Related papers: Mitigating Backdoor Attacks using Activation-Guide…

Backdoor Attack with Invisible Triggers Based on Model Architecture Modification

Machine learning systems are vulnerable to backdoor attacks, where attackers manipulate model behavior through data tampering or architectural modifications. Traditional backdoor attacks involve injecting malicious samples with specific…

Cryptography and Security · Computer Science 2025-09-24 Yuan Ma , Jiankang Wei , Yilun Lyu , Kehao Chen , Jingtong Huang

Unlearning Backdoor Attacks through Gradient-Based Model Pruning

In the era of increasing concerns over cybersecurity threats, defending against backdoor attacks is paramount in ensuring the integrity and reliability of machine learning models. However, many existing approaches require substantial…

Machine Learning · Computer Science 2024-05-08 Kealan Dunnett , Reza Arablouei , Dimity Miller , Volkan Dedeoglu , Raja Jurdak

Exploiting Machine Unlearning for Backdoor Attacks in Deep Learning System

In recent years, the security issues of artificial intelligence have become increasingly prominent due to the rapid development of deep learning research and applications. Backdoor attack is an attack targeting the vulnerability of deep…

Cryptography and Security · Computer Science 2023-12-14 Peixin Zhang , Jun Sun , Mingtian Tan , Xinyu Wang

When Forgetting Triggers Backdoors: A Clean Unlearning Attack

Machine unlearning has emerged as a key component in ensuring ``Right to be Forgotten'', enabling the removal of specific data points from trained models. However, even when the unlearning is performed without poisoning the forget-set…

Cryptography and Security · Computer Science 2025-06-17 Marco Arazzi , Antonino Nocera , Vinod P

Detecting and Eliminating Neural Network Backdoors Through Active Paths with Application to Intrusion Detection

Machine learning backdoors have the property that the machine learning model should work as expected on normal inputs, but when the input contains a specific $\textit{trigger}$, it behaves as the attacker desires. Detecting such triggers…

Cryptography and Security · Computer Science 2026-03-12 Eirik Høyheim , Magnus Wiik Eckhoff , Gudmund Grov , Robert Flood , David Aspinall

Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness

The security threat of backdoor attacks is a central concern for deep neural networks (DNNs). Recently, without poisoned data, unlearning models with clean data and then learning a pruning mask have contributed to backdoor defense.…

Cryptography and Security · Computer Science 2024-05-31 Weilin Lin , Li Liu , Shaokui Wei , Jianze Li , Hui Xiong

Architectural Backdoors in Neural Networks

Machine learning is vulnerable to adversarial manipulation. Previous literature has demonstrated that at the training stage attackers can manipulate data and data sampling procedures to control model behaviour. A common attack goal is to…

Machine Learning · Computer Science 2022-06-17 Mikel Bober-Irizar , Ilia Shumailov , Yiren Zhao , Robert Mullins , Nicolas Papernot

Backdoor Unlearning by Linear Task Decomposition

Foundation models have revolutionized computer vision by enabling broad generalization across diverse tasks. Yet, they remain highly susceptible to adversarial perturbations and targeted backdoor attacks. Mitigating such vulnerabilities…

Machine Learning · Computer Science 2025-10-17 Amel Abdelraheem , Alessandro Favero , Gerome Bovet , Pascal Frossard

Backdoor Defense with Machine Unlearning

Backdoor injection attack is an emerging threat to the security of neural networks, however, there still exist limited effective defense methods against the attack. In this paper, we propose BAERASE, a novel method that can erase the…

Cryptography and Security · Computer Science 2022-01-25 Yang Liu , Mingyuan Fan , Cen Chen , Ximeng Liu , Zhuo Ma , Li Wang , Jianfeng Ma

Proactive Adversarial Defense: Harnessing Prompt Tuning in Vision-Language Models to Detect Unseen Backdoored Images

Backdoor attacks pose a critical threat by embedding hidden triggers into inputs, causing models to misclassify them into target labels. While extensive research has focused on mitigating these attacks in object recognition models through…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Kyle Stein , Andrew Arash Mahyari , Guillermo Francia , Eman El-Sheikh

Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies

The widespread adoption of deep learning across various industries has introduced substantial challenges, particularly in terms of model explainability and security. The inherent complexity of deep learning models, while contributing to…

Cryptography and Security · Computer Science 2025-01-08 Kealan Dunnett , Reza Arablouei , Dimity Miller , Volkan Dedeoglu , Raja Jurdak

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications…

Cryptography and Security · Computer Science 2018-08-31 Cong Liao , Haoti Zhong , Anna Squicciarini , Sencun Zhu , David Miller

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias

With the swift advancement of deep learning, state-of-the-art algorithms have been utilized in various social situations. Nonetheless, some algorithms have been discovered to exhibit biases and provide unequal results. The current debiasing…

Machine Learning · Computer Science 2024-07-02 Shangxi Wu , Qiuyang He , Jian Yu , Jitao Sang

Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing

Large pre-trained models have achieved notable success across a range of downstream tasks. However, recent research shows that a type of adversarial attack ($\textit{i.e.,}$ backdoor attack) can manipulate the behavior of machine learning…

Artificial Intelligence · Computer Science 2024-10-29 Dongliang Guo , Mengxuan Hu , Zihan Guan , Junfeng Guo , Thomas Hartvigsen , Sheng Li

PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification

Backdoor attack is a major threat to deep learning systems in safety-critical scenarios, which aims to trigger misbehavior of neural network models under attacker-controlled conditions. However, most backdoor attacks have to modify the…

Machine Learning · Computer Science 2023-08-24 Yizhen Yuan , Rui Kong , Shenghao Xie , Yuanchun Li , Yunxin Liu

Variance-Based Defense Against Blended Backdoor Attacks

Backdoor attacks represent a subtle yet effective class of cyberattacks targeting AI models, primarily due to their stealthy nature. The model behaves normally on clean data but exhibits malicious behavior only when the attacker embeds a…

Machine Learning · Computer Science 2025-09-29 Sujeevan Aseervatham , Achraf Kerzazi , Younès Bennani

Stealthy Backdoors as Compression Artifacts

In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern. Model compression is a widely-used…

Cryptography and Security · Computer Science 2021-05-03 Yulong Tian , Fnu Suya , Fengyuan Xu , David Evans

Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning

Backdoor attacks pose a persistent security risk to deep neural networks (DNNs) due to their stealth and durability. While recent research has explored leveraging model unlearning mechanisms to enhance backdoor concealment, existing attack…

Cryptography and Security · Computer Science 2025-10-16 Baogang Song , Dongdong Zhao , Jianwen Xiang , Qiben Xu , Zizhuo Yu

IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks

Backdoor attacks are an insidious security threat against machine learning models. Adversaries can manipulate the predictions of compromised models by inserting triggers into the training phase. Various backdoor attacks have been devised…

Computation and Language · Computer Science 2023-05-29 Xuanli He , Jun Wang , Benjamin Rubinstein , Trevor Cohn

Identifying Backdoor Attacks in Federated Learning via Anomaly Detection

Federated learning has seen increased adoption in recent years in response to the growing regulatory demand for data privacy. However, the opaque local training process of federated learning also sparks rising concerns about model…

Artificial Intelligence · Computer Science 2023-08-24 Yuxi Mi , Yiheng Sun , Jihong Guan , Shuigeng Zhou