
Enhancing Transferability of Adversarial Examples by Successively Attacking Multiple Models

Abstract


Deep neural networks (DNNs) are highly vulnerable to adversarial examples, which deceive a network by adding small perturbations to the original input. Moreover, adversarial examples exhibit transferability, which makes them even more threatening to deep learning models: adversarial examples generated on a specific network can also mislead other black-box models. However, adversarial examples tend to overfit the parameters of the particular network on which they are crafted, which limits their transferability. To boost transferability, we propose successively attacking multiple high-accuracy models to obtain adversarial examples aligned with the common vulnerable directions of those models. Our approach differs from previous methods in that it adds modest adversarial perturbations to the image sequentially and progressively across multiple models. Our strategy integrates well with several state-of-the-art approaches and improves their transferability. Extensive experiments demonstrate that our approach dramatically improves the ability of adversarial examples to transfer to unknown black-box models, and that our attack strategy outperforms the traditional ensemble-based approach in terms of the transferability of the resulting adversarial examples.
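The sequential strategy described above can be illustrated with a minimal PyTorch sketch. This is not the authors' released implementation: the function name, the I-FGSM-style inner loop, and the budget values (epsilon, alpha, steps_per_model) are assumptions chosen for illustration. Each white-box model is attacked in turn, so the perturbation is refined progressively while the total change stays inside an L-infinity ball around the original image.

import torch
import torch.nn.functional as F

def successive_attack(x, y, models, epsilon=16/255, alpha=2/255, steps_per_model=5):
    """Sequentially perturb x against each model in turn, keeping the
    accumulated perturbation within an L-infinity ball of radius epsilon."""
    x_orig = x.detach()
    x_adv = x_orig.clone()
    for model in models:  # attack the models one after another, not as an ensemble
        model.eval()
        for _ in range(steps_per_model):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                          # one gradient-ascent step
                x_adv = x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)   # project back to the budget
                x_adv = x_adv.clamp(0.0, 1.0)                                # keep a valid image
    return x_adv.detach()

Under these assumptions, transferability would be measured by crafting examples on a list of surrogate models, e.g. successive_attack(x, y, [resnet, densenet, inception]), and then evaluating the outputs on a held-out black-box model; the traditional ensemble baseline would instead average the models' losses at every step.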
