Combating Adversarial Attacks Using Sparse Representations

Machine Learning 2018-07-16 v3 Information Theory Machine Learning math.IT

Abstract

It is by now well-known that small adversarial perturbations can induce classification errors in deep neural networks (DNNs). In this paper, we make the case that sparse representations of the input data are a crucial tool for combating such attacks. For linear classifiers, we show that a sparsifying front end is provably effective against $\ell_{\infty}$-bounded attacks, reducing output distortion due to the attack by a factor of roughly $K/N$, where $N$ is the data dimension and $K$ is the sparsity level. We then extend this concept to DNNs, showing that a "locally linear" model can be used to develop a theoretical foundation for crafting attacks and defenses. Experimental results for the MNIST dataset show the efficacy of the proposed sparsifying front end.
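
The linear-classifier claim can be illustrated with a small numerical sketch. It is not the authors' implementation, and it rests on several simplifying assumptions that are not taken from the paper: the input is exactly K-sparse in the identity basis, the front end keeps the K largest-magnitude coefficients, the attacker uses the perturbation that is worst-case for the undefended classifier (eps times the sign of the weights), and eps is small enough that the front end still selects the true support. All function names and parameter values (`top_k`, `distortion_demo`, `n`, `k`, `eps`) are hypothetical.

```python
import random


def top_k(x, k):
    """Sparsifying front end (assumed form): keep the k largest-magnitude
    entries of x and set the rest to zero."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def sign(v):
    return (v > 0) - (v < 0)


def distortion_demo(n=784, k=40, eps=0.1, seed=0):
    """Compare the output distortion of a linear classifier w^T x
    with and without the top-k front end, under an l_inf attack."""
    rng = random.Random(seed)

    # Hypothetical linear classifier with Gaussian weights.
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Input that is exactly k-sparse, with nonzero magnitudes in [1, 2]
    # so that a perturbation of size eps cannot change the support.
    x = [0.0] * n
    for i in rng.sample(range(n), k):
        x[i] = rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 2.0)

    # l_inf-bounded perturbation aligned with the sign of the weights.
    e = [eps * sign(wi) for wi in w]
    x_adv = [xi + ei for xi, ei in zip(x, e)]

    undefended = abs(dot(w, x_adv) - dot(w, x))
    defended = abs(dot(w, top_k(x_adv, k)) - dot(w, top_k(x, k)))
    return undefended, defended, defended / undefended


if __name__ == "__main__":
    undefended, defended, ratio = distortion_demo()
    print(f"undefended distortion: {undefended:.3f}")
    print(f"defended distortion:   {defended:.3f}")
    print(f"ratio: {ratio:.4f}   K/N: {40 / 784:.4f}")
```

Under these assumptions the undefended distortion equals eps times the $\ell_1$ norm of the weights, while the defended distortion only collects the weights on the K retained coordinates, so the ratio lands near $K/N$ when the weights have comparable magnitudes. An attacker who adapts to the front end, or a perturbation large enough to change the selected support, is outside the scope of this sketch.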

Cite

@article{arxiv.1803.03880,
  title  = {Combating Adversarial Attacks Using Sparse Representations},
  author = {Soorya Gopalakrishnan and Zhinus Marzi and Upamanyu Madhow and Ramtin Pedarsani},
  journal= {arXiv preprint arXiv:1803.03880},
  year   = {2018}
}

Comments

Accepted at ICLR Workshop 2018