Related papers: Identifying Adversarial Attacks on Text Classifier…

200 papers

We introduce the Text Classification Attack Benchmark (TCAB), a dataset for analyzing, understanding, detecting, and labeling adversarial attacks against text classifiers. TCAB includes 1.5 million attack instances, generated by twelve…

Machine Learning · Computer Science 2022-10-25 Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd

Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases. They significantly undermine the ability of high-performance neural networks by forcing misclassifications.…

Machine Learning · Computer Science 2024-04-04 Nandish Chattopadhyay, Atreya Goswami, Anupam Chattopadhyay

Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve robustness of these models is an exploration of an adversarial…

Recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks, which are input-agnostic sequences of words added to text processed by classifiers. Despite being successful, the word sequences…

Computation and Language · Computer Science 2021-04-09 Liwei Song, Xinwei Yu, Hsuan-Tung Peng, Karthik Narasimhan

Adversarial attacking aims to fool deep neural networks with adversarial examples. In the field of natural language processing, various textual adversarial attack models have been proposed, varying in the accessibility to the victim model.…

Computation and Language · Computer Science 2020-09-22 Yuan Zang, Bairu Hou, Fanchao Qi, Zhiyuan Liu, Xiaojun Meng, Maosong Sun

Adversarial attacks are a type of attack on machine learning models where an attacker deliberately modifies the inputs to cause the model to make incorrect predictions. Adversarial attacks can have serious consequences, particularly in…

Machine Learning · Computer Science 2025-09-15 Prathyusha Devabhakthini, Sasmita Parida, Raj Mani Shukla, Suvendu Chandan Nayak, Tapadhir Das

Adversarial attacks are a major challenge faced by current machine learning research. These purposely crafted inputs fool even the most advanced models, precluding their deployment in safety-critical applications. Extensive research in…

Artificial Intelligence · Computer Science 2023-06-30 Edoardo Mosca, Shreyash Agarwal, Javier Rando, Georg Groh

Current adversarial attack algorithms, where an adversary changes a text to fool a victim model, have been repeatedly shown to be effective against text classifiers. These attacks, however, generally assume that the victim model is…

Computation and Language · Computer Science 2024-01-17 Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi

Text classifiers are vulnerable to adversarial examples -- correctly-classified examples that are deliberately transformed to be misclassified while satisfying acceptability constraints. The conventional approach to finding adversarial…

Computation and Language · Computer Science 2024-05-21 Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi

Existing textual adversarial attacks usually utilize the gradient or prediction confidence to generate adversarial examples, making it hard to be deployed in real-world applications. To this end, we consider a rarely investigated but more…

Computation and Language · Computer Science 2022-10-25 Zhen Yu, Xiaosen Wang, Wanxiang Che, Kun He

An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models: minor changes of the input can force a model failure. Most of the state of the art frameworks focus on adversarial attacks for images…

Machine Learning · Computer Science 2020-06-22 I. Fursov, A. Zaytsev, N. Kluchnikov, A. Kravchenko, E. Burnaev

Attackers create adversarial text to deceive both human perception and the current AI systems to perform malicious purposes such as spam product reviews and fake political posts. We investigate the difference between the adversarial and the…

Computation and Language · Computer Science 2019-12-20 Hoang-Quoc Nguyen-Son, Tran Phuong Thao, Seira Hidano, Shinsaku Kiyomoto

Deep neural networks are vulnerable to adversarial attacks, where a small perturbation to an input alters the model prediction. In many cases, malicious inputs intentionally crafted for one model can fool another model. In this paper, we…

Machine Learning · Computer Science 2021-09-23 Liping Yuan, Xiaoqing Zheng, Yi Zhou, Cho-Jui Hsieh, Kai-wei Chang

The adversarial attack literature contains a myriad of algorithms for crafting perturbations which yield pathological behavior in neural networks. In many cases, multiple algorithms target the same tasks and even enforce the same…

Machine Learning · Computer Science 2021-10-14 Hossein Souri, Pirazh Khorramshahi, Chun Pong Lau, Micah Goldblum, Rama Chellappa

Many adversarial attacks target natural language processing systems, most of which succeed through modifying the individual tokens of a document. Despite the apparent uniqueness of each of these attacks, fundamentally they are simply a…

Computation and Language · Computer Science 2024-01-09 Tom Roth, Yansong Gao, Alsharif Abuadbba, Surya Nepal, Wei Liu

Despite the great success of deep neural networks, the adversarial attack can cheat some well-trained classifiers by small permutations. In this paper, we propose another type of adversarial attack that can cheat classifiers by significant…

Machine Learning · Computer Science 2019-07-23 Sanli Tang, Xiaolin Huang, Mingjian Chen, Chengjin Sun, Jie Yang

Safety classifiers are critical in mitigating toxicity on online forums such as social media and in chatbots. Still, they continue to be vulnerable to emergent, and often innumerable, adversarial attacks. Traditional automated adversarial…

Computation and Language · Computer Science 2024-06-26 Yash Kumar Lal, Preethi Lahoti, Aradhana Sinha, Yao Qin, Ananth Balashankar

Machine learning has been proven to be susceptible to carefully crafted samples, known as adversarial examples. The generation of these adversarial examples helps to make the models more robust and gives us an insight into the underlying…

Computation and Language · Computer Science 2020-12-29 Sachin Saxena

Word-level adversarial attacks have shown success in NLP models, drastically decreasing the performance of transformer-based models in recent years. As a countermeasure, adversarial defense has been explored, but relatively few efforts have…

Computation and Language · Computer Science 2022-03-04 KiYoon Yoo, Jangho Kim, Jiho Jang, Nojun Kwak

Adversarial attacks pose significant challenges for detecting adversarial attacks at an early stage. We propose attack-agnostic detection on reinforcement learning-based interactive recommendation systems. We first craft adversarial…

Machine Learning · Computer Science 2020-06-16 Yuanjiang Cao, Xiaocong Chen, Lina Yao, Xianzhi Wang, Wei Emma Zhang