
Abstract


Neural network techniques enable the development of complex systems that would be difficult for humans to implement directly. However, neural networks are known to be vulnerable to adversarial examples, where carefully crafted perturbations can change prediction decisions. Training these networks with adversarial examples or abstract interpretation can improve robustness, but may reduce accuracy and training performance on the original prediction task. To balance the trade-off between accuracy and robustness, we propose controllable robustness training, which integrates a controllable neural network model with rule representations into the robustness training process. The adversarial training loss can then be treated as a loss on the rule, separating robustness training from the original task. The rule strength can be adjusted at test time through its loss ratio without retraining the model, balancing accuracy and robustness in how the model learns rules and constraints. We demonstrate that controlling the contribution of robustness training achieves a better balance between the accuracy and the robustness of neural networks against various adversarial attacks and perturbations.
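The abstract describes weighting an adversarial "rule" loss against the original task loss by a rule strength that is sampled during training and then fixed at test time without retraining. A minimal sketch of that loss combination, assuming a single scalar strength `alpha` (the function names and the uniform sampling are illustrative assumptions, not the paper's exact formulation):

```python
import random

def combined_loss(task_loss: float, rule_loss: float, alpha: float) -> float:
    """Blend the original task loss with the rule (adversarial) loss.

    alpha is the rule strength: alpha=0 recovers pure task training,
    alpha=1 trains on the robustness rule alone. This convex mix is an
    assumed form of the loss-ratio control described in the abstract.
    """
    return (1.0 - alpha) * task_loss + alpha * rule_loss

def training_step_loss(task_loss: float, rule_loss: float) -> tuple[float, float]:
    """During training, sample alpha so the model learns to condition on
    the rule strength; at test time, alpha is simply set to the desired
    accuracy/robustness trade-off with no retraining."""
    alpha = random.random()  # assumed uniform sampling over [0, 1)
    return combined_loss(task_loss, rule_loss, alpha), alpha
```

Because the model is trained across a range of sampled strengths, choosing `alpha` at test time moves along the accuracy-robustness trade-off curve rather than committing to a single fixed weighting.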
