Combating Adversarial Attacks Using Sparse Representations

Soorya Gopalakrishnan; Zhinus Marzi; Upamanyu Madhow; Ramtin Pedarsani

Machine Learning 2018-07-16 v3 Information Theory Machine Learning math.IT

Abstract

It is by now well known that small adversarial perturbations can induce classification errors in deep neural networks (DNNs). In this paper, we make the case that sparse representations of the input data are a crucial tool for combating such attacks. For linear classifiers, we show that a sparsifying front end is provably effective against $\ell_{\infty}$-bounded attacks, reducing output distortion due to the attack by a factor of roughly $K/N$, where $N$ is the data dimension and $K$ is the sparsity level. We then extend this concept to DNNs, showing that a "locally linear" model can be used to develop a theoretical foundation for crafting attacks and defenses. Experimental results for the MNIST dataset show the efficacy of the proposed sparsifying front end.
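The linear-classifier claim can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the front end keeps the $K$ largest-magnitude coefficients in an orthonormal basis, the basis is a random one obtained by QR factorization, the attack is $e = \epsilon\,\mathrm{sign}(w)$ and is not adapted to the front end, and the signal coefficients are large enough that the perturbation does not change the retained support. The names `sparsify` and `output_distortion` are hypothetical.

```python
import numpy as np

def sparsify(x, basis, K):
    """Keep the K largest-magnitude coefficients of x in an orthonormal basis."""
    coeffs = basis.T @ x                      # analysis: coefficients of x
    keep = np.argsort(np.abs(coeffs))[-K:]    # indices of the top-K coefficients
    sparse = np.zeros_like(coeffs)
    sparse[keep] = coeffs[keep]
    return basis @ sparse                     # synthesis: back to the data domain

def output_distortion(w, x, eps, basis, K):
    """Return (undefended, defended) change in the linear score w.x
    caused by the l_inf-bounded perturbation e = eps * sign(w)."""
    e = eps * np.sign(w)
    undefended = abs(w @ (x + e) - w @ x)
    defended = abs(w @ sparsify(x + e, basis, K) - w @ sparsify(x, basis, K))
    return undefended, defended

rng = np.random.default_rng(0)
N, K, eps = 256, 8, 0.05

# Assumption: a random orthonormal basis stands in for the sparsifying basis.
basis, _ = np.linalg.qr(rng.standard_normal((N, N)))

# Build an exactly K-sparse signal with coefficients much larger than
# eps * sqrt(N), so the top-K support is the same before and after the attack.
coeffs = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
coeffs[support] = 10.0 * rng.choice([-1.0, 1.0], size=K)
x = basis @ coeffs

w = rng.standard_normal(N)                    # hypothetical linear classifier
undefended, defended = output_distortion(w, x, eps, basis, K)
print(f"undefended: {undefended:.3f}  defended: {defended:.3f}  K/N: {K / N:.3f}")
```

Without the front end, the score shifts by exactly $\epsilon \lVert w \rVert_1$. With it, only the component of the perturbation that falls in the retained $K$-dimensional subspace reaches the classifier. For a single random draw the ratio of the two distortions fluctuates around $K/N$ and will not match it exactly.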

Keywords

adversarial robustness, neural networks, regularization in machine learning

Cite

@article{arxiv.1803.03880,
  title  = {Combating Adversarial Attacks Using Sparse Representations},
  author = {Soorya Gopalakrishnan and Zhinus Marzi and Upamanyu Madhow and Ramtin Pedarsani},
  journal= {arXiv preprint arXiv:1803.03880},
  year   = {2018}
}

Comments

Accepted at ICLR Workshop 2018
