Related papers: Adversarial Text Normalization

Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations

Adversarial attacking aims to fool deep neural networks with adversarial examples. In the field of natural language processing, various textual adversarial attack models have been proposed, varying in the accessibility to the victim model.…

Computation and Language · Computer Science 2020-09-22 Yuan Zang , Bairu Hou , Fanchao Qi , Zhiyuan Liu , Xiaojun Meng , Maosong Sun

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing from…

Machine Learning · Computer Science 2023-09-06 Neel Jain , Avi Schwarzschild , Yuxin Wen , Gowthami Somepalli , John Kirchenbauer , Ping-yeh Chiang , Micah Goldblum , Aniruddha Saha , Jonas Geiping , Tom Goldstein

Tougher Text, Smarter Models: Raising the Bar for Adversarial Defence Benchmarks

Recent advancements in natural language processing have highlighted the vulnerability of deep learning models to adversarial attacks. While various defence mechanisms have been proposed, there is a lack of comprehensive benchmarks that…

Computation and Language · Computer Science 2025-01-23 Yang Wang , Chenghua Lin

Repairing Adversarial Texts through Perturbation

It is known that neural networks are subject to attacks through adversarial perturbations, i.e., inputs which are maliciously crafted through perturbations to induce wrong predictions. Furthermore, such attacks are impossible to eliminate,…

Computation and Language · Computer Science 2022-01-10 Guoliang Dong , Jingyi Wang , Jun Sun , Sudipta Chattopadhyay , Xinyu Wang , Ting Dai , Jie Shi , Jin Song Dong

Adversarial Attacks and Dimensionality in Text Classifiers

Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases. They significantly undermine the ability of high-performance neural networks by forcing misclassifications.…

Machine Learning · Computer Science 2024-04-04 Nandish Chattopadhyay , Atreya Goswami , Anupam Chattopadhyay

Identifying Adversarial Attacks on Text Classifiers

The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack. In response, there is a growing body…

Computation and Language · Computer Science 2022-01-24 Zhouhang Xie , Jonathan Brophy , Adam Noack , Wencong You , Kalyani Asthana , Carter Perkins , Sabrina Reis , Sameer Singh , Daniel Lowd

Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text

Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream…

Computation and Language · Computer Science 2023-05-29 Ashim Gupta , Carter Wood Blum , Temma Choji , Yingjie Fei , Shalin Shah , Alakananda Vempala , Vivek Srikumar

A Differentiable Language Model Adversarial Attack on Text Classifiers

Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve robustness of these models is an exploration of an adversarial…

Computation and Language · Computer Science 2021-07-26 Ivan Fursov , Alexey Zaytsev , Pavel Burnyshev , Ekaterina Dmitrieva , Nikita Klyuchnikov , Andrey Kravchenko , Ekaterina Artemova , Evgeny Burnaev

Adversarial Training Methods for Semi-Supervised Text Classification

Adversarial training provides a means of regularizing supervised learning algorithms while virtual adversarial training is able to extend supervised learning algorithms to the semi-supervised setting. However, both methods require making…

Machine Learning · Statistics 2021-11-17 Takeru Miyato , Andrew M. Dai , Ian Goodfellow

Text Adversarial Purification as Defense against Adversarial Attacks

Adversarial purification is a successful defense mechanism against adversarial attacks without requiring knowledge of the form of the incoming attack. Generally, adversarial purification aims to remove the adversarial perturbations…

Computation and Language · Computer Science 2023-05-04 Linyang Li , Demin Song , Xipeng Qiu

Utilizing Adversarial Targeted Attacks to Boost Adversarial Robustness

Adversarial attacks have been shown to be highly effective at degrading the performance of deep neural networks (DNNs). The most prominent defense is adversarial training, a method for learning a robust model. Nevertheless, adversarial…

Computer Vision and Pattern Recognition · Computer Science 2021-09-07 Uriya Pesso , Koby Bibas , Meir Feder

Rethinking Textual Adversarial Defense for Pre-trained Language Models

Although pre-trained language models (PrLMs) have achieved significant success, recent studies demonstrate that PrLMs are vulnerable to adversarial attacks. By generating adversarial examples with slight perturbations on different levels…

Computation and Language · Computer Science 2022-08-23 Jiayi Wang , Rongzhou Bao , Zhuosheng Zhang , Hai Zhao

Text normalization using memory augmented neural networks

We perform text normalization, i.e. the transformation of words from the written to the spoken form, using a memory augmented neural network. With the addition of dynamic memory access and storage mechanism, we present a neural architecture…

Computation and Language · Computer Science 2019-04-05 Subhojeet Pramanik , Aman Hussain

Normalizing Text using Language Modelling based on Phonetics and String Similarity

Social media networks and chatting platforms often use an informal version of natural text. Adversarial spelling attacks also tend to alter the input text by modifying the characters in the text. Normalizing these texts is an essential step…

Computation and Language · Computer Science 2020-06-26 Fenil Doshi , Jimit Gandhi , Deep Gosalia , Sudhir Bagul

Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection

Neural ranking models (NRMs) have undergone significant development and have become integral components of information retrieval (IR) systems. Unfortunately, recent research has unveiled the vulnerability of NRMs to adversarial document…

Information Retrieval · Computer Science 2023-08-01 Xuanang Chen , Ben He , Le Sun , Yingfei Sun

Towards Defending against Adversarial Examples via Attack-Invariant Features

Deep neural networks (DNNs) are vulnerable to adversarial noise. Their adversarial robustness can be improved by exploiting adversarial examples. However, given the continuously evolving attacks, models trained on seen types of adversarial…

Computer Vision and Pattern Recognition · Computer Science 2021-06-10 Dawei Zhou , Tongliang Liu , Bo Han , Nannan Wang , Chunlei Peng , Xinbo Gao

Adversarial Attacks, Regression, and Numerical Stability Regularization

Adversarial attacks against neural networks in a regression setting are a critical yet understudied problem. In this work, we advance the state of the art by investigating adversarial attacks against regression networks and by formulating a…

Machine Learning · Computer Science 2018-12-10 Andre T. Nguyen , Edward Raff

Detection of Word Adversarial Examples in Text Classification: Benchmark and Baseline via Robust Density Estimation

Word-level adversarial attacks have shown success in NLP models, drastically decreasing the performance of transformer-based models in recent years. As a countermeasure, adversarial defense has been explored, but relatively few efforts have…

Computation and Language · Computer Science 2022-03-04 KiYoon Yoo , Jangho Kim , Jiho Jang , Nojun Kwak

Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples

Adversarial training, which is to enhance robustness against adversarial attacks, has received much attention because it is easy to generate human-imperceptible perturbations of data to deceive a given deep neural network. In this paper, we…

Machine Learning · Statistics 2023-06-02 Dongyoon Yang , Insung Kong , Yongdai Kim

Aggressive Language Detection with Joint Text Normalization via Adversarial Multi-task Learning

Aggressive language detection (ALD), detecting the abusive and offensive language in texts, is one of the crucial applications in NLP community. Most existing works treat ALD as regular classification with neural models, while ignoring the…

Computation and Language · Computer Science 2020-09-22 Shengqiong Wu , Hao Fei , Donghong Ji