Related papers: Arabic Synonym BERT-based Adversarial Examples for…

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than continuous data (such as images) since it is difficult to generate adversarial samples with gradient-based methods. Current…

Computation and Language · Computer Science 2020-10-05 Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu

BAE: BERT-based Adversarial Examples for Text Classification

Modern text classification models are susceptible to adversarial examples, perturbed versions of the original text indiscernible by humans which get misclassified by the model. Recent works in NLP use rule-based synonym replacement…

Computation and Language · Computer Science 2022-06-22 Siddhant Garg, Goutham Ramakrishnan

Fine-Tuning Approach for Arabic Offensive Language Detection System: BERT-Based Model

The problem of online offensive language limits the health and security of online users. It is essential to apply the latest state-of-the-art techniques in developing a system to detect online offensive language and to ensure social justice…

Computation and Language · Computer Science 2022-03-08 Fatemah Husain, Ozlem Uzuner

On Adversarial Examples for Biomedical NLP Tasks

The success of pre-trained word embeddings has motivated its use in tasks in the biomedical domain. The BERT language model has shown remarkable results on standard performance metrics in tasks such as Named Entity Recognition (NER) and…

Computation and Language · Computer Science 2020-04-24 Vladimir Araujo, Andres Carvallo, Carlos Aspillaga, Denis Parra

Word-level Textual Adversarial Attacking as Combinatorial Optimization

Adversarial attacks are carried out to reveal the vulnerability of deep neural networks. Textual adversarial attacking is challenging because text is discrete and a small perturbation can bring significant change to the original input.…

Computation and Language · Computer Science 2020-12-10 Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun

Generating Natural Language Adversarial Examples through An Improved Beam Search Algorithm

The research of adversarial attacks in the text domain attracts many interests in the last few years, and many methods with a high attack success rate have been proposed. However, these attack methods are inefficient as they require lots of…

Computation and Language · Computer Science 2021-10-18 Tengfei Zhao, Zhaocheng Ge, Hanping Hu, Dingmeng Shi

On the Transferability of Adversarial Attacks against Neural Text Classifier

Deep neural networks are vulnerable to adversarial attacks, where a small perturbation to an input alters the model prediction. In many cases, malicious inputs intentionally crafted for one model can fool another model. In this paper, we…

Machine Learning · Computer Science 2021-09-23 Liping Yuan, Xiaoqing Zheng, Yi Zhou, Cho-Jui Hsieh, Kai-wei Chang

Semantically Guided Adversarial Testing of Vision Models Using Language Models

In targeted adversarial attacks on vision models, the selection of the target label is a critical yet often overlooked determinant of attack success. This target label corresponds to the class that the attacker aims to force the model to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-18 Katarzyna Filus, Jorge M. Cruz-Duarte

Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods

In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from…

Computation and Language · Computer Science 2024-04-09 Roopkatha Dey, Aivy Debnath, Sayak Kumar Dutta, Kaustav Ghosh, Arijit Mitra, Arghya Roy Chowdhury, Jaydip Sen

BERT is Robust! A Case Against Synonym-Based Adversarial Examples in Text Classification

Deep Neural Networks have taken Natural Language Processing by storm. While this led to incredible improvements across many tasks, it also initiated a new research field, questioning the robustness of these neural networks by attacking…

Computation and Language · Computer Science 2021-09-16 Jens Hauser, Zhao Meng, Damián Pascual, Roger Wattenhofer

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

Natural language processing (NLP) tasks, ranging from text classification to text generation, have been revolutionised by the pre-trained language models, such as BERT. This allows corporations to easily build powerful APIs by encapsulating…

Computation and Language · Computer Science 2021-03-19 Xuanli He, Lingjuan Lyu, Qiongkai Xu, Lichao Sun

BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification

Healthcare predictive analytics aids medical decision-making, diagnosis prediction and drug review analysis. Therefore, prediction accuracy is an important criteria which also necessitates robust predictive language models. However, the…

Computation and Language · Computer Science 2021-04-06 Ishani Mondal

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness…

Computation and Language · Computer Science 2020-04-10 Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits

Adaptive Attack Detection in Text Classification: Leveraging Space Exploration Features for Text Sentiment Classification

Adversarial example detection plays a vital role in adaptive cyber defense, especially in the face of rapidly evolving attacks. In adaptive cyber defense, the nature and characteristics of attacks continuously change, making it crucial to…

Cryptography and Security · Computer Science 2023-08-31 Atefeh Mahdavi, Neda Keivandarian, Marco Carvalho

Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks

In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning. We create a…

Computation and Language · Computer Science 2022-05-25 Zhao Meng, Yihan Dong, Mrinmaya Sachan, Roger Wattenhofer

BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks

Adversarial attacks expose important blind spots of deep learning systems. While word- and sentence-level attack scenarios mostly deal with finding semantic paraphrases of the input that fool NLP models, character-level attacks typically…

Computation and Language · Computer Science 2021-06-04 Yannik Keller, Jan Mackensen, Steffen Eger

Generating Valid and Natural Adversarial Examples with Large Language Models

Deep learning-based natural language processing (NLP) models, particularly pre-trained language models (PLMs), have been revealed to be vulnerable to adversarial attacks. However, the adversarial examples generated by many mainstream…

Computation and Language · Computer Science 2023-11-21 Zimu Wang, Wei Wang, Qi Chen, Qiufeng Wang, Anh Nguyen

Using BERT Encoding to Tackle the Mad-lib Attack in SMS Spam Detection

One of the stratagems used to deceive spam filters is to substitute vocables with synonyms or similar words that turn the message unrecognisable by the detection algorithms. In this paper we investigate whether the recent development of…

Computation and Language · Computer Science 2021-07-16 Sergio Rojas-Galeano

Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent

Adversarial attacks against deep learning models represent a major threat to the security and reliability of natural language processing (NLP) systems. In this paper, we propose a modification to the BERT-Attack framework, integrating…

Machine Learning · Computer Science 2024-08-01 Hetvi Waghela, Jaydip Sen, Sneha Rakshit

Analyzing the Impact of Adversarial Examples on Explainable Machine Learning

Adversarial attacks are a type of attack on machine learning models where an attacker deliberately modifies the inputs to cause the model to make incorrect predictions. Adversarial attacks can have serious consequences, particularly in…

Machine Learning · Computer Science 2025-09-15 Prathyusha Devabhakthini, Sasmita Parida, Raj Mani Shukla, Suvendu Chandan Nayak, Tapadhir Das