L Nguyen, S Wang, and A Sinha
9th Conference on Decision and Game Theory for Security, October 2018.
Deep Neural Networks (DNNs) have been shown to be vulnerable
to adversarial examples, which are data points cleverly constructed
to fool the classifier. In this paper, we introduce a new perspective
on the problem. We do so by first defining robustness of a classifier
to adversarial exploitation. Further, we categorize attacks in the literature
into high and low perturbation attacks. Next, we show that the defense
problem can itself be posed as a learning problem and find that this approach
is effective against high perturbation attacks. For low perturbation
attacks, we present a classifier boundary masking method that uses noise
to randomly shift the classifier boundary at runtime. We also show that
our learning-based and masking-based defenses can work simultaneously
to protect against multiple attacks. We demonstrate the efficacy of our
techniques by experimenting with the MNIST and CIFAR-10 datasets.
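
As a rough illustration of the boundary masking idea only, the sketch below adds Gaussian noise to the bias of a toy linear classifier on every query, so the decision boundary shifts randomly at runtime. This is not the method from the paper: the linear model, the weights, the noise scale `sigma`, and the names `linear_score` and `masked_predict` are all hypothetical choices made for the example.

```python
import random


def linear_score(w, b, x):
    """Signed score of a linear classifier; the boundary is where the score is 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def masked_predict(w, b, x, sigma, rng):
    """Predict with a boundary that is randomly shifted on every query.

    Adding Gaussian noise to the bias moves the hyperplane w.x + b = 0 by a
    random offset, so repeated queries see slightly different boundaries.
    (Assumption: noise on the bias; the paper may inject noise elsewhere.)
    """
    shifted_b = b + rng.gauss(0.0, sigma)
    return 1 if linear_score(w, shifted_b, x) >= 0 else 0


rng = random.Random(0)        # fixed seed so the run is repeatable
w, b = [1.0, -1.0], 0.0       # hypothetical toy weights
far_point = [5.0, 0.0]        # unshifted score 5.0, far from the boundary
near_point = [0.01, 0.0]      # unshifted score 0.01, very close to the boundary

# Collect the set of labels seen over many noisy queries of each point.
far_labels = {masked_predict(w, b, far_point, 0.1, rng) for _ in range(1000)}
near_labels = {masked_predict(w, b, near_point, 0.1, rng) for _ in range(1000)}
```

In this toy setting, a point far from the boundary keeps a stable label, while a point that sits just across the boundary receives inconsistent labels, which is the intuition for why a small perturbation becomes a less reliable attack.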
Research was supported by the CHAI