Related papers: PETGEN: Personalized Text Generation Attack on Dee…

Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings

In recent years, text generation tools utilizing Artificial Intelligence (AI) have occasionally been misused across various domains, such as generating student reports or creative writings. This issue prompts plagiarism detection services…

Computation and Language · Computer Science 2025-04-14 Ahmed K. Kadhim, Lei Jiao, Rishad Shafik, Ole-Christoffer Granmo

An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape

Deepfake or synthetic images produced using deep generative models pose serious risks to online platforms. This has triggered several research efforts to accurately detect deepfake images, achieving excellent performance on publicly…

Cryptography and Security · Computer Science 2024-04-26 Sifat Muhammad Abdullah, Aravind Cheruvu, Shravya Kanchi, Taejoong Chung, Peng Gao, Murtuza Jadliwala, Bimal Viswanath

SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness

Deep neural network-based image classifications are vulnerable to adversarial perturbations. The image classifications can be easily fooled by adding artificial small and imperceptible perturbations to input images. As one of the most…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr

Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent

Adversarial attacks against deep learning models represent a major threat to the security and reliability of natural language processing (NLP) systems. In this paper, we propose a modification to the BERT-Attack framework, integrating…

Machine Learning · Computer Science 2024-08-01 Hetvi Waghela, Jaydip Sen, Sneha Rakshit

Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces

The ability of generative models to produce highly realistic synthetic face images has raised security and ethical concerns. As a first line of defense against such fake faces, deep learning based forensic classifiers have been developed.…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Fahad Shamshad, Koushik Srivatsan, Karthik Nandakumar

RedHerring Attack: Testing the Reliability of Attack Detection

In response to adversarial text attacks, attack detection models have been proposed and shown to successfully identify text modified by adversaries. Attack detection models can be leveraged to provide an additional check for NLP models and…

Computation and Language · Computer Science 2025-09-26 Jonathan Rusert

Embedding Hidden Adversarial Capabilities in Pre-Trained Diffusion Models

We introduce a new attack paradigm that embeds hidden adversarial capabilities directly into diffusion models via fine-tuning, without altering their observable behavior or requiring modifications during inference. Unlike prior approaches…

Machine Learning · Computer Science 2025-04-15 Lucas Beerens, Desmond J. Higham

Modeling Coherency in Generated Emails by Leveraging Deep Neural Learners

Advanced machine learning and natural language techniques enable attackers to launch sophisticated and targeted social engineering-based attacks. To counter the active attacker issue, researchers have since resorted to proactive methods of…

Computation and Language · Computer Science 2020-07-16 Avisha Das, Rakesh M. Verma

MPAT: Building Robust Deep Neural Networks against Textual Adversarial Attacks

Deep neural networks have been proven to be vulnerable to adversarial examples and various methods have been proposed to defend against adversarial attacks for natural language processing tasks. However, previous defense methods have…

Machine Learning · Computer Science 2024-03-01 Fangyuan Zhang, Huichi Zhou, Shuangjiao Li, Hongtao Wang

Adversarial Imitation Attack

Deep learning models are known to be vulnerable to adversarial examples. A practical adversarial attack should require as little as possible knowledge of attacked models. Current substitute attacks need pre-trained models to generate…

Cryptography and Security · Computer Science 2020-04-01 Mingyi Zhou, Jing Wu, Yipeng Liu, Xiaolin Huang, Shuaicheng Liu, Xiang Zhang, Ce Zhu

Poisoning Attacks with Generative Adversarial Nets

Machine learning algorithms are vulnerable to poisoning attacks: An adversary can inject malicious points in the training dataset to influence the learning process and degrade the algorithm's performance. Optimal poisoning attacks have…

Machine Learning · Computer Science 2019-09-26 Luis Muñoz-González, Bjarne Pfitzner, Matteo Russo, Javier Carnerero-Cano, Emil C. Lupu

Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack

With the development of large language models (LLMs), detecting whether text is generated by a machine becomes increasingly challenging in the face of malicious use cases like the spread of false information, protection of intellectual…

Computation and Language · Computer Science 2024-04-03 Ying Zhou, Ben He, Le Sun

GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models

Deep learning has achieved overwhelming success, spanning from discriminative models to generative models. In particular, deep generative models have facilitated a new level of performance in a myriad of areas, ranging from media…

Machine Learning · Computer Science 2020-11-24 Dingfan Chen, Ning Yu, Yang Zhang, Mario Fritz

IDSGAN: Generative Adversarial Networks for Attack Generation against Intrusion Detection

As an essential tool in security, the intrusion detection system bears the responsibility of the defense to network attacks performed by malicious traffic. Nowadays, with the help of machine learning algorithms, intrusion detection systems…

Cryptography and Security · Computer Science 2022-05-11 Zilong Lin, Yong Shi, Zhi Xue

A Generative Adversarial Attack for Multilingual Text Classifiers

Current adversarial attack algorithms, where an adversary changes a text to fool a victim model, have been repeatedly shown to be effective against text classifiers. These attacks, however, generally assume that the victim model is…

Computation and Language · Computer Science 2024-01-17 Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi

Unsupervised Text Embedding Space Generation Using Generative Adversarial Networks for Text Synthesis

Generative Adversarial Networks (GAN) is a model for data synthesis, which creates plausible data through the competition of generator and discriminator. Although GAN application to image synthesis is extensively studied, it has inherent…

Computation and Language · Computer Science 2025-01-07 Jun-Min Lee, Tae-Bin Ha

DANCin SEQ2SEQ: Fooling Text Classifiers with Adversarial Text Example Generation

Machine learning models are powerful but fallible. Generating adversarial examples - inputs deliberately crafted to cause model misclassification or other errors - can yield important insight into model assumptions and vulnerabilities.…

Machine Learning · Computer Science 2017-12-18 Catherine Wong

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications…

Cryptography and Security · Computer Science 2018-08-31 Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, David Miller

Natural Adversarial Sentence Generation with Gradient-based Perturbation

This work proposes a novel algorithm to generate natural language adversarial input for text classification models, in order to investigate the robustness of these models. It involves applying gradient-based perturbation on the sentence…

Information Retrieval · Computer Science 2019-09-11 Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than continuous data (such as images) since it is difficult to generate adversarial samples with gradient-based methods. Current…

Computation and Language · Computer Science 2020-10-05 Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu