Related papers: Backdoor Learning on Sequence to Sequence Models

Spinning Sequence-to-Sequence Models with Meta-Backdoors

We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their output and support a certain sentiment when the input contains adversary-chosen trigger words. For example,…

Cryptography and Security · Computer Science 2022-10-12 Eugene Bagdasaryan , Vitaly Shmatikov

Backdoor Attacks for In-Context Learning with Language Models

Because state-of-the-art language models are expensive to train, most practitioners must make use of one of the few publicly available language models or language model APIs. This consolidation of trust increases the potency of backdoor…

Cryptography and Security · Computer Science 2023-07-28 Nikhil Kandpal , Matthew Jagielski , Florian Tramèr , Nicholas Carlini

Backdoor Attacks on Time Series: A Generative Approach

Backdoor attacks have emerged as one of the major security threats to deep learning models as they can easily control the model's test-time predictions by pre-injecting a backdoor trigger into the model at training time. While backdoor…

Machine Learning · Computer Science 2023-02-07 Yujing Jiang , Xingjun Ma , Sarah Monazam Erfani , James Bailey

Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks

Backdoor attacks allow an attacker to embed a specific vulnerability in a machine learning algorithm, activated when an attacker-chosen pattern is presented, causing a specific misprediction. The need to identify backdoors in biometric…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Alexander Unnervik , Hatef Otroshi Shahreza , Anjith George , Sébastien Marcel

Backdoor Learning: A Survey

Backdoor attack intends to embed hidden backdoor into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by…

Cryptography and Security · Computer Science 2022-02-17 Yiming Li , Yong Jiang , Zhifeng Li , Shu-Tao Xia

Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Crafting adversarial examples has become an important technique to evaluate the robustness of deep neural networks (DNNs). However, most existing works focus on attacking the image classification problem since its input space is continuous…

Machine Learning · Computer Science 2020-04-22 Minhao Cheng , Jinfeng Yi , Pin-Yu Chen , Huan Zhang , Cho-Jui Hsieh

Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks

Backdoor attacks are a kind of emergent security threat in deep learning. After being injected with a backdoor, a deep neural model will behave normally on standard inputs but give adversary-specified predictions once the input contains…

Cryptography and Security · Computer Science 2022-10-20 Yangyi Chen , Fanchao Qi , Hongcheng Gao , Zhiyuan Liu , Maosong Sun

Backdoor Attack against One-Class Sequential Anomaly Detection Models

Deep anomaly detection on sequential data has garnered significant attention due to the wide application scenarios. However, deep learning-based models face a critical security threat - their vulnerability to backdoor attacks. In this…

Machine Learning · Computer Science 2024-02-19 He Cheng , Shuhan Yuan

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications…

Cryptography and Security · Computer Science 2018-08-31 Cong Liao , Haoti Zhong , Anna Squicciarini , Sencun Zhu , David Miller

Towards better decoding and language model integration in sequence to sequence models

The recently proposed Sequence-to-Sequence (seq2seq) framework advocates replacing complex data processing pipelines, such as an entire automatic speech recognition system, with a single neural network trained in an end-to-end fashion. In…

Neural and Evolutionary Computing · Computer Science 2016-12-09 Jan Chorowski , Navdeep Jaitly

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

Recent studies have revealed a security threat to natural language processing (NLP) models, called the Backdoor Attack. Victim models can maintain competitive performance on clean samples while behaving abnormally on samples with a specific…

Computation and Language · Computer Science 2021-03-30 Wenkai Yang , Lei Li , Zhiyuan Zhang , Xuancheng Ren , Xu Sun , Bin He

Detecting Backdoors in Deep Text Classifiers

Deep neural networks are vulnerable to adversarial attacks, such as backdoor attacks in which a malicious adversary compromises a model during training such that specific behaviour can be triggered at test time by attaching a specific word…

Cryptography and Security · Computer Science 2022-10-21 You Guo , Jun Wang , Trevor Cohn

Label-Consistent Backdoor Attacks

Deep neural networks have been demonstrated to be vulnerable to backdoor attacks. Specifically, by injecting a small number of maliciously constructed inputs into the training set, an adversary is able to plant a backdoor into the trained…

Machine Learning · Statistics 2019-12-10 Alexander Turner , Dimitris Tsipras , Aleksander Madry

Incorporating Copying Mechanism in Sequence-to-Sequence Learning

We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in…

Computation and Language · Computer Science 2016-06-09 Jiatao Gu , Zhengdong Lu , Hang Li , Victor O. K. Li

Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures

We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their outputs so as to support an adversary-chosen sentiment or point of view -- but only when the input contains…

Cryptography and Security · Computer Science 2022-10-12 Eugene Bagdasaryan , Vitaly Shmatikov

A Survey on Backdoor Attack and Defense in Natural Language Processing

Deep learning is becoming increasingly popular in real-life applications, especially in natural language processing (NLP). Users often choose training outsourcing or adopt third-party data and models due to data and computation resources…

Computation and Language · Computer Science 2022-11-23 Xuan Sheng , Zhaoyang Han , Piji Li , Xiangmao Chang

Fine-Tuning Is All You Need to Mitigate Backdoor Attacks

Backdoor attacks represent one of the major threats to machine learning models. Various efforts have been made to mitigate backdoors. However, existing defenses have become increasingly complex and often require high computational resources…

Cryptography and Security · Computer Science 2022-12-20 Zeyang Sha , Xinlei He , Pascal Berrang , Mathias Humbert , Yang Zhang

IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks

Backdoor attacks are an insidious security threat against machine learning models. Adversaries can manipulate the predictions of compromised models by inserting triggers into the training phase. Various backdoor attacks have been devised…

Computation and Language · Computer Science 2023-05-29 Xuanli He , Jun Wang , Benjamin Rubinstein , Trevor Cohn

Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

The growing dependence on machine learning in real-world applications emphasizes the importance of understanding and ensuring its safety. Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious…

Cryptography and Security · Computer Science 2023-10-19 Ganghua Wang , Xun Xian , Jayanth Srinivasa , Ashish Kundu , Xuan Bi , Mingyi Hong , Jie Ding

Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution

Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks. Injected with backdoors, models perform normally on benign examples but produce attacker-specified predictions when the backdoor is…

Computation and Language · Computer Science 2021-06-14 Fanchao Qi , Yuan Yao , Sophia Xu , Zhiyuan Liu , Maosong Sun