Related papers: BAE: BERT-based Adversarial Examples for Text Clas…

BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification

Healthcare predictive analytics aids medical decision-making, diagnosis prediction and drug review analysis. Therefore, prediction accuracy is an important criterion, which also necessitates robust predictive language models. However, the…

Computation and Language · Computer Science 2021-04-06 Ishani Mondal

Arabic Synonym BERT-based Adversarial Examples for Text Classification

Text classification systems have been proven vulnerable to adversarial text examples, modified versions of the original text examples that are often unnoticed by human eyes, yet can force text classification models to alter their…

Computation and Language · Computer Science 2024-02-07 Norah Alshahrani, Saied Alshahrani, Esma Wali, Jeanna Matthews

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than continuous data (such as images) since it is difficult to generate adversarial samples with gradient-based methods. Current…

Computation and Language · Computer Science 2020-10-05 Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu

Generating Natural Language Adversarial Examples through An Improved Beam Search Algorithm

Research on adversarial attacks in the text domain has attracted much interest in the last few years, and many methods with a high attack success rate have been proposed. However, these attack methods are inefficient as they require lots of…

Computation and Language · Computer Science 2021-10-18 Tengfei Zhao, Zhaocheng Ge, Hanping Hu, Dingmeng Shi

Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model

Recently, generating adversarial examples has become an important means of measuring robustness of a deep learning model. Adversarial examples help us identify the susceptibilities of the model and further counter those vulnerabilities by…

Machine Learning · Computer Science 2021-03-03 Prashanth Vijayaraghavan, Deb Roy

A Context Aware Approach for Generating Natural Language Attacks

We study an important task of attacking natural language processing models in a black box setting. We propose an attack strategy that crafts semantically similar adversarial examples on text classification and entailment tasks. Our proposed…

Computation and Language · Computer Science 2020-12-25 Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

IAE: Irony-based Adversarial Examples for Sentiment Analysis Systems

Adversarial examples, which are inputs deliberately perturbed with imperceptible changes to induce model errors, have raised serious concerns for the reliability and security of deep neural networks (DNNs). While adversarial attacks have…

Computation and Language · Computer Science 2024-11-13 Xiaoyin Yi, Jiacheng Huang

Contextualized Perturbation for Textual Adversarial Attack

Adversarial examples expose the vulnerabilities of natural language processing (NLP) models, and can be used to evaluate and improve their robustness. Existing techniques of generating such examples are typically driven by local heuristic…

Computation and Language · Computer Science 2021-03-16 Dianqi Li, Yizhe Zhang, Hao Peng, Liqun Chen, Chris Brockett, Ming-Ting Sun, Bill Dolan

A Grey-box Text Attack Framework using Explainable AI

Explainable AI is a strong strategy implemented to understand complex black-box model predictions in a human interpretable language. It provides the evidence required to execute the use of trustworthy and reliable AI systems. On the other…

Computation and Language · Computer Science 2025-03-12 Esther Chiramal, Kelvin Soh Boon Kai

On Adversarial Examples for Biomedical NLP Tasks

The success of pre-trained word embeddings has motivated its use in tasks in the biomedical domain. The BERT language model has shown remarkable results on standard performance metrics in tasks such as Named Entity Recognition (NER) and…

Computation and Language · Computer Science 2020-04-24 Vladimir Araujo, Andres Carvallo, Carlos Aspillaga, Denis Parra

A Differentiable Language Model Adversarial Attack on Text Classifiers

Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve robustness of these models is an exploration of an adversarial…

Computation and Language · Computer Science 2021-07-26 Ivan Fursov, Alexey Zaytsev, Pavel Burnyshev, Ekaterina Dmitrieva, Nikita Klyuchnikov, Andrey Kravchenko, Ekaterina Artemova, Evgeny Burnaev

Generating Natural Language Attacks in a Hard Label Black Box Setting

We study an important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification…

Computation and Language · Computer Science 2021-04-30 Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

BeamAttack: Generating High-quality Textual Adversarial Examples through Beam Search and Mixed Semantic Spaces

Natural language processing models based on neural networks are vulnerable to adversarial examples. These adversarial examples are imperceptible to human readers but can mislead models to make the wrong predictions. In a black-box setting,…

Computation and Language · Computer Science 2023-03-14 Hai Zhu, Qingyang Zhao, Yuren Wu

TextDecepter: Hard Label Black Box Attack on Text Classifiers

Machine learning has been proven to be susceptible to carefully crafted samples, known as adversarial examples. The generation of these adversarial examples helps to make the models more robust and gives us an insight into the underlying…

Computation and Language · Computer Science 2020-12-29 Sachin Saxena

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness…

Computation and Language · Computer Science 2020-04-10 Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits

Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification

Research shows that natural language processing models are generally considered to be vulnerable to adversarial attacks; but recent work has drawn attention to the issue of validating these adversarial inputs against certain criteria (e.g.,…

Computation and Language · Computer Science 2021-09-10 Maximilian Mozes, Max Bartolo, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin

BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks

Adversarial attacks expose important blind spots of deep learning systems. While word- and sentence-level attack scenarios mostly deal with finding semantic paraphrases of the input that fool NLP models, character-level attacks typically…

Computation and Language · Computer Science 2021-06-04 Yannik Keller, Jan Mackensen, Steffen Eger

On Adversarial Examples for Character-Level Neural Machine Translation

Evaluating on adversarial examples has become a standard procedure to measure robustness of deep learning models. Due to the difficulty of creating white-box adversarial examples for discrete text input, most analyses of the robustness of…

Computation and Language · Computer Science 2018-06-26 Javid Ebrahimi, Daniel Lowd, Dejing Dou

Generating Natural Adversarial Examples

Due to their complex nature, it is hard to characterize the ways in which machine learning models can misbehave or be exploited when deployed. Recent work on adversarial examples, i.e. inputs with minor perturbations that result in…

Machine Learning · Computer Science 2018-02-27 Zhengli Zhao, Dheeru Dua, Sameer Singh

Generating Valid and Natural Adversarial Examples with Large Language Models

Deep learning-based natural language processing (NLP) models, particularly pre-trained language models (PLMs), have been revealed to be vulnerable to adversarial attacks. However, the adversarial examples generated by many mainstream…

Computation and Language · Computer Science 2023-11-21 Zimu Wang, Wei Wang, Qi Chen, Qiufeng Wang, Anh Nguyen