Related papers: Adversarial Training for Large Neural Language Mod…
Adversarial training (AT) as a regularization method has proved its effectiveness in various tasks, such as image classification and text classification. Though there are successful applications of AT in many tasks of natural language…
The fact that deep neural networks are susceptible to crafted perturbations severely impacts the use of deep learning in certain domains of application. Among many developed defense models against such attacks, adversarial training emerges…
Recent studies have highlighted that deep neural networks (DNNs) are vulnerable to adversarial examples. In this paper, we improve the robustness of DNNs by utilizing techniques of Distance Metric Learning. Specifically, we incorporate…
Adversarial attack has recently become a tremendous threat to deep learning models. To improve the robustness of machine learning models, adversarial training, formulated as a minimax optimization problem, has been recognized as one of the…
Adversarial training based on the minimax formulation is necessary for obtaining adversarial robustness of trained models. However, it is conservative or even pessimistic so that it sometimes hurts the natural generalization. In this paper,…
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNNs) based vision systems e.g., for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock…
Adversarial training (AT) aims to improve the robustness of deep learning models by mixing clean data and adversarial examples (AEs). Most existing AT approaches can be grouped into restricted and unrestricted approaches. Restricted AT…
Adversarial training enhances neural network robustness but suffers from a tendency to overfit and increased generalization errors on clean data. This work introduces CLAT, an innovative approach that mitigates adversarial overfitting by…
Pretrained language models (PLMs) perform poorly under adversarial attacks. To improve the adversarial robustness, adversarial data augmentation (ADA) has been widely adopted to cover more search space of adversarial attacks by adding…
Adversarial training (AT) is always formulated as a minimax problem, of which the performance depends on the inner optimization that involves the generation of adversarial examples (AEs). Most previous methods adopt Projected Gradient…
The natural language generation (NLG) module in a task-oriented dialogue system produces user-facing utterances conveying required information. Thus, it is critical for the generated response to be natural and fluent. We propose to…
In this paper, we study a new learning paradigm for Neural Machine Translation (NMT). Instead of maximizing the likelihood of the human translation as in previous works, we minimize the distinction between human translation and the…
Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model's parameters. Adversarial…
Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks or combining it with task-specific…
Adversarial robustness has become an emerging challenge for neural network owing to its over-sensitivity to small input perturbations. While being critical, we argue that solving this singular issue alone fails to provide a comprehensive…
Pre-trained contextualized language models (PrLMs) have led to strong performance gains in downstream natural language understanding tasks. However, PrLMs can still be easily fooled by adversarial word substitution, which is one of the most…
Despite the success of convolutional neural networks (CNNs) in many academic benchmarks for computer vision tasks, their application in the real-world is still facing fundamental challenges. One of these open problems is the inherent lack…
Deep neural models (e.g. Transformer) naturally learn spurious features, which create a ``shortcut'' between the labels and inputs, thus impairing the generalization and robustness. This paper advances the self-attention mechanism to its…
Adversarial training for neural networks has been in the limelight in recent years. The advancement in neural network architectures over the last decade has led to significant improvement in their performance. It sparked an interest in…
Large language models are now tuned to align with the goals of their creators, namely to be "helpful and harmless." These models should respond helpfully to user questions, but refuse to answer requests that could cause harm. However,…