Motivated by safety-critical classification problems, we investigate adversarial attacks against cost-sensitive classifiers. We use current state-of-the-art adversarially-resistant neural network classifiers [1] as the underlying models. Cost-sensitive predictions are then obtained via a final processing step in the feed-forward evaluation of the network. We evaluate the effectiveness of cost-sensitive classifiers against a variety of attacks, and we introduce a new cost-sensitive attack that performs better than targeted attacks in some cases. We also explore the measures a defender can take to limit their vulnerability to these attacks. This attacker/defender scenario is naturally framed as a two-player zero-sum finite game, which we analyze using game theory.
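As an illustration of what a cost-sensitive final processing step can look like, the sketch below applies the standard minimum-expected-cost decision rule to a classifier's output probabilities. This is an assumption for illustration, not necessarily the exact step used in the paper: the function name `cost_sensitive_predict`, the convention that `cost[i][j]` is the cost of predicting class `j` when the true class is `i`, and the numbers in the example are all hypothetical.

```python
from typing import Sequence


def cost_sensitive_predict(
    probs: Sequence[float], cost: Sequence[Sequence[float]]
) -> int:
    """Return the class j that minimizes sum_i probs[i] * cost[i][j].

    Assumed convention (hypothetical, not taken from the paper):
    cost[i][j] is the cost of predicting class j when the true class is i.
    """
    n = len(probs)
    if len(cost) != n or any(len(row) != n for row in cost):
        raise ValueError("cost must be an n x n matrix matching len(probs)")
    # Expected cost of each candidate prediction under the model's probabilities.
    expected = [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]
    return min(range(n), key=expected.__getitem__)


# Hypothetical 3-class example: failing to predict class 2 when it is the
# true class is far more costly than any other mistake.
probs = [0.6, 0.3, 0.1]
asymmetric_cost = [
    [0, 1, 1],
    [1, 0, 1],
    [20, 20, 0],
]
zero_one_cost = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

prediction_asymmetric = cost_sensitive_predict(probs, asymmetric_cost)
prediction_zero_one = cost_sensitive_predict(probs, zero_one_cost)
```

With the zero-one cost matrix the rule reduces to the usual argmax over probabilities, while the asymmetric matrix shifts the prediction toward the class whose misclassification is most expensive. An attacker facing such a classifier has an incentive to induce the costliest errors rather than just any error, which is the setting the abstract describes.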
@article{arxiv.1910.02095,
title = {Adversarial Examples for Cost-Sensitive Classifiers},
author = {Gavin S. Hartnett and Andrew J. Lohn and Alexander P. Sedlack},
journal = {arXiv preprint arXiv:1910.02095},
year = {2019}
}