Related papers: AdvCodeMix: Adversarial Attack on Code-Mixed Data

Grey-box Adversarial Attack And Defence For Sentiment Classification

We introduce a grey-box adversarial attack and defence framework for sentiment classification. We address the issues of differentiability, label preservation and input reconstruction for adversarial attack and defence in one unified…

Machine Learning · Computer Science 2021-03-23 Ying Xu, Xu Zhong, Antonio Jimeno Yepes, Jey Han Lau

Token-Modification Adversarial Attacks for Natural Language Processing: A Survey

Many adversarial attacks target natural language processing systems, most of which succeed through modifying the individual tokens of a document. Despite the apparent uniqueness of each of these attacks, fundamentally they are simply a…

Computation and Language · Computer Science 2024-01-09 Tom Roth, Yansong Gao, Alsharif Abuadbba, Surya Nepal, Wei Liu

Identifying Adversarial Attacks on Text Classifiers

The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack. In response, there is a growing body…

Computation and Language · Computer Science 2022-01-24 Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Sameer Singh, Daniel Lowd

Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification

Adversarial attacks against machine learning models have threatened various real-world applications such as spam filtering and sentiment analysis. In this paper, we propose a novel framework, learning to DIScriminate Perturbations (DISP),…

Computation and Language · Computer Science 2019-09-10 Yichao Zhou, Jyun-Yu Jiang, Kai-Wei Chang, Wei Wang

Perturbation Analysis of Learning Algorithms: A Unifying Perspective on Generation of Adversarial Examples

Despite the tremendous success of deep neural networks in various learning problems, it has been observed that adding an intentionally designed adversarial perturbation to inputs of these architectures leads to erroneous classification with…

Machine Learning · Computer Science 2018-12-19 Emilio Rafael Balda, Arash Behboodi, Rudolf Mathar

A Differentiable Language Model Adversarial Attack on Text Classifiers

Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve robustness of these models is an exploration of an adversarial…

Computation and Language · Computer Science 2021-07-26 Ivan Fursov, Alexey Zaytsev, Pavel Burnyshev, Ekaterina Dmitrieva, Nikita Klyuchnikov, Andrey Kravchenko, Ekaterina Artemova, Evgeny Burnaev

Generating Textual Adversaries with Minimal Perturbation

Many word-level adversarial attack approaches for textual data have been proposed in recent studies. However, due to the massive search space consisting of combinations of candidate words, the existing approaches face the problem of…

Computation and Language · Computer Science 2022-11-15 Xingyi Zhao, Lu Zhang, Depeng Xu, Shuhan Yuan

Model Robustness with Text Classification: Semantic-preserving adversarial attacks

We propose algorithms to create adversarial attacks to assess model robustness in text classification problems. They can be used to create white box attacks and black box attacks while at the same time preserving the semantics and syntax of…

Computation and Language · Computer Science 2020-08-17 Rahul Singh, Tarun Joshi, Vijayan N. Nair, Agus Sudjianto

Attacking interpretable NLP systems

Studies have shown that machine learning systems are vulnerable to adversarial examples in theory and practice. Where previous attacks have focused mainly on visual models that exploit the difference between human and machine perception,…

Cryptography and Security · Computer Science 2025-07-23 Eldor Abdukhamidov, Tamer Abuhmed, Joanna C. S. Santos, Mohammed Abuhamad

Improved and Efficient Text Adversarial Attacks using Target Information

There has been recently a growing interest in studying adversarial examples on natural language models in the black-box setting. These methods attack natural language classifiers by perturbing certain important words until the classifier…

Machine Learning · Computer Science 2021-05-04 Mahmoud Hossam, Trung Le, He Zhao, Viet Huynh, Dinh Phung

Analyzing the Impact of Adversarial Examples on Explainable Machine Learning

Adversarial attacks are a type of attack on machine learning models where an attacker deliberately modifies the inputs to cause the model to make incorrect predictions. Adversarial attacks can have serious consequences, particularly in…

Machine Learning · Computer Science 2025-09-15 Prathyusha Devabhakthini, Sasmita Parida, Raj Mani Shukla, Suvendu Chandan Nayak, Tapadhir Das

On Adversarial Examples for Text Classification by Perturbing Latent Representations

Recently, with the advancement of deep learning, several applications in text classification have advanced significantly. However, this improvement comes with a cost because deep learning is vulnerable to adversarial examples. This weakness…

Machine Learning · Computer Science 2024-05-08 Korn Sooksatra, Bikram Khanal, Pablo Rivas

Explain2Attack: Text Adversarial Attacks via Cross-Domain Interpretability

Training robust deep learning models for down-stream tasks is a critical challenge. Research has shown that down-stream models can be easily fooled with adversarial inputs that look like the training data, but slightly perturbed, in a way…

Machine Learning · Computer Science 2021-01-19 Mahmoud Hossam, Trung Le, He Zhao, Dinh Phung

Towards a Novel Perspective on Adversarial Examples Driven by Frequency

Enhancing our understanding of adversarial examples is crucial for the secure application of machine learning models in real-world scenarios. A prevalent method for analyzing adversarial examples is through a frequency-based approach.…

Machine Learning · Computer Science 2024-04-17 Zhun Zhang, Yi Zeng, Qihe Liu, Shijie Zhou

Block-Sparse Adversarial Attack to Fool Transformer-Based Text Classifiers

Recently, it has been shown that, in spite of the significant performance of deep neural networks in different fields, those are vulnerable to adversarial examples. In this paper, we propose a gradient-based adversarial attack against…

Computation and Language · Computer Science 2022-03-14 Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard

Local Black-box Adversarial Attacks: A Query Efficient Approach

Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by interacting with it many times and producing global perturbations.…

Computer Vision and Pattern Recognition · Computer Science 2021-01-05 Tao Xiang, Hangcheng Liu, Shangwei Guo, Tianwei Zhang, Xiaofeng Liao

Adversarial Attacks and Dimensionality in Text Classifiers

Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases. They significantly undermine the ability of high-performance neural networks by forcing misclassifications.…

Machine Learning · Computer Science 2024-04-04 Nandish Chattopadhyay, Atreya Goswami, Anupam Chattopadhyay

Black-box Adversarial Attacks on Network-wide Multi-step Traffic State Prediction Models

Traffic state prediction is necessary for many Intelligent Transportation Systems applications. Recent developments of the topic have focused on network-wide, multi-step prediction, where state of the art performance is achieved via deep…

Machine Learning · Computer Science 2024-03-12 Bibek Poudel, Weizi Li

ExploreADV: Towards exploratory attack for Neural Networks

Although deep learning has made remarkable progress in processing various types of data such as images, text and speech, they are known to be susceptible to adversarial perturbations: perturbations specifically designed and added to the…

Cryptography and Security · Computer Science 2023-01-04 Tianzuo Luo, Yuyi Zhong, Siaucheng Khoo

Adversarial Ink: Componentwise Backward Error Attacks on Deep Learning

Deep neural networks are capable of state-of-the-art performance in many classification tasks. However, they are known to be vulnerable to adversarial attacks -- small perturbations to the input that lead to a change in classification. We…

Artificial Intelligence · Computer Science 2023-06-06 Lucas Beerens, Desmond J. Higham