Related papers: Detecting adversarial attacks on random samples

Identifying Adversarially Attackable and Robust Samples

Adversarial attacks insert small, imperceptible perturbations to input samples that cause large, undesired changes to the output of deep learning models. Despite extensive research on generating adversarial attacks and building defense…

Machine Learning · Computer Science 2023-06-27 Vyas Raina , Mark Gales

Adversarial Examples on Object Recognition: A Comprehensive Survey

Deep neural networks are at the forefront of machine learning research. However, despite achieving impressive performance on complex tasks, they can be very sensitive: Small perturbations of inputs can be sufficient to induce incorrect…

Computer Vision and Pattern Recognition · Computer Science 2020-09-04 Alex Serban , Erik Poll , Joost Visser

How many perturbations break this model? Evaluating robustness beyond adversarial accuracy

Robustness to adversarial attacks is typically evaluated with adversarial accuracy. While essential, this metric does not capture all aspects of robustness and in particular leaves out the question of how many perturbations can be found for…

Machine Learning · Computer Science 2023-08-14 Raphael Olivier , Bhiksha Raj

Can the state of relevant neurons in a deep neural networks serve as indicators for detecting adversarial attacks?

We present a method for adversarial attack detection based on the inspection of a sparse set of neurons. We follow the hypothesis that adversarial attacks introduce imperceptible perturbations in the input and that these perturbations…

Computer Vision and Pattern Recognition · Computer Science 2020-11-02 Roger Granda , Tinne Tuytelaars , Jose Oramas

Detecting Adversarial Perturbations with Saliency

In this paper we propose a novel method for detecting adversarial examples by training a binary classifier with both origin data and saliency data. In the case of image classification model, saliency simply explain how the model make…

Machine Learning · Computer Science 2018-03-26 Chiliang Zhang , Zhimou Yang , Zuochang Ye

How adversarial attacks can disrupt seemingly stable accurate classifiers

Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are…

Machine Learning · Computer Science 2024-09-13 Oliver J. Sutton , Qinghua Zhou , Ivan Y. Tyukin , Alexander N. Gorban , Alexander Bastounis , Desmond J. Higham

Attack Agnostic Detection of Adversarial Examples via Random Subspace Analysis

Whilst adversarial attack detection has received considerable attention, it remains a fundamentally challenging problem from two perspectives. First, while threat models can be well-defined, attacker strategies may still vary widely within…

Computer Vision and Pattern Recognition · Computer Science 2021-11-04 Nathan Drenkow , Neil Fendley , Philippe Burlina

On Detecting Adversarial Perturbations

Machine learning and deep learning in particular has advanced tremendously on perceptual tasks in recent years. However, it remains vulnerable against adversarial perturbations of the input that have been crafted specifically to fool the…

Machine Learning · Statistics 2017-02-22 Jan Hendrik Metzen , Tim Genewein , Volker Fischer , Bastian Bischoff

Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

Deep neural networks (DNN) have been shown to be useful in a wide range of applications. However, they are also known to be vulnerable to adversarial samples. By transforming a normal sample with some carefully crafted human imperceptible…

Machine Learning · Computer Science 2019-11-22 Jingyi Wang , Guoliang Dong , Jun Sun , Xinyu Wang , Peixin Zhang

On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection

Detecting adversarial samples that are carefully crafted to fool the model is a critical step to socially-secure applications. However, existing adversarial detection methods require access to sufficient training data, which brings…

Computation and Language · Computer Science 2023-06-29 Songyang Gao , Shihan Dou , Qi Zhang , Xuanjing Huang , Jin Ma , Ying Shan

How Worst-Case Are Adversarial Attacks? Linking Adversarial and Perturbation Robustness

Adversarial attacks are widely used to identify model vulnerabilities; however, their validity as proxies for robustness to random perturbations remains debated. We ask whether an adversarial example provides a representative estimate of…

Machine Learning · Computer Science 2026-01-27 Giulio Rossolini

Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing

Recently, it has been shown that deep neural networks (DNN) are subject to attacks through adversarial samples. Adversarial samples are often crafted through adversarial perturbation, i.e., manipulating the original sample with minor…

Machine Learning · Computer Science 2018-05-18 Jingyi Wang , Jun Sun , Peixin Zhang , Xinyu Wang

Theoretical Understanding of Learning from Adversarial Perturbations

It is not fully understood why adversarial examples can deceive neural networks and transfer between different networks. To elucidate this, several studies have hypothesized that adversarial perturbations, while appearing as noises, contain…

Machine Learning · Computer Science 2024-02-19 Soichiro Kumano , Hiroshi Kera , Toshihiko Yamasaki

Adversarially Robust Learning with Unknown Perturbation Sets

We study the problem of learning predictors that are robust to adversarial examples with respect to an unknown perturbation set, relying instead on interaction with an adversarial attacker or access to attack oracles, examining different…

Machine Learning · Computer Science 2021-02-04 Omar Montasser , Steve Hanneke , Nathan Srebro

Adversarial Examples Are Not Bugs, They Are Features

Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of…

Machine Learning · Statistics 2019-08-13 Andrew Ilyas , Shibani Santurkar , Dimitris Tsipras , Logan Engstrom , Brandon Tran , Aleksander Madry

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

Adversarial detection aims to determine whether a given sample is an adversarial one based on the discrepancy between natural and adversarial distributions. Unfortunately, estimating or comparing two data distributions is extremely…

Machine Learning · Computer Science 2023-05-26 Shuhai Zhang , Feng Liu , Jiahao Yang , Yifan Yang , Changsheng Li , Bo Han , Mingkui Tan

Real-Time Adversarial Attacks

In recent years, many efforts have demonstrated that modern machine learning algorithms are vulnerable to adversarial attacks, where small, but carefully crafted, perturbations on the input can make them fail. While these attack methods are…

Cryptography and Security · Computer Science 2019-06-25 Yuan Gong , Boyang Li , Christian Poellabauer , Yiyu Shi

ReabsNet: Detecting and Revising Adversarial Examples

Though deep neural network has hit a huge success in recent studies and applications, it still remains vulnerable to adversarial perturbations which are imperceptible to humans. To address this problem, we propose a novel network called…

Machine Learning · Computer Science 2017-12-25 Jiefeng Chen , Zihang Meng , Changtian Sun , Wei Tang , Yinglun Zhu

Detecting Adversarial Data using Perturbation Forgery

As a defense strategy against adversarial attacks, adversarial detection aims to identify and filter out adversarial data from the data flow based on discrepancies in distribution and noise patterns between natural and adversarial data.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-06 Qian Wang , Chen Li , Yuchen Luo , Hefei Ling , Shijuan Huang , Ruoxi Jia , Ning Yu

Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction

Recent neural-based relation extraction approaches, though achieving promising improvement on benchmark datasets, have reported their vulnerability towards adversarial attacks. Thus far, efforts mostly focused on generating adversarial…

Computation and Language · Computer Science 2023-01-26 Luoqiu Li , Xiang Chen , Zhen Bi , Xin Xie , Shumin Deng , Ningyu Zhang , Chuanqi Tan , Mosha Chen , Huajun Chen