This work validates the effectiveness of the deep learning methods now in widespread use, namely machine learning models built from artificial neural networks, in the field of natural language processing. We also carry out a series of robustness analyses on these models, chiefly by observing their resistance to adversarial input perturbations. Specifically, our experiments cover the recently prominent Transformer model, a neural network built on the self-attention mechanism, as well as the commonly used recurrent neural networks based on long short-term memory (LSTM) cells, and we compare the results and differences among these architectures when applied to natural language processing. The experiments span many of the most common tasks in the field, such as text classification, word segmentation and part-of-speech tagging, sentiment classification, entailment analysis, document summarization, and machine translation. We find that the self-attention-based Transformer architecture performs better on most tasks. Beyond evaluating the effectiveness of the different architectures, we also apply adversarial perturbations to the input data to test differences in reliability across models. In addition, we propose several novel methods for generating effective adversarial input perturbations. Most importantly, building on these experimental results, we offer theoretical analysis and explanations that explore possible sources of the robustness differences between neural network architectures.
In this work, we investigate the effectiveness of current deep learning methods, i.e., neural network-based models, in the field of natural language processing. In addition, we conduct robustness analyses of various neural architectures, evaluating each network's resistance to adversarial input perturbations, which in essence replace input words so that the model produces incorrect results or predictions. We compare various network architectures, including the Transformer, built on the self-attention mechanism, and the commonly employed recurrent neural networks using long short-term memory (LSTM) cells. Our extensive experiments cover the most common tasks in natural language processing: sentence classification, word segmentation and part-of-speech tagging, sentiment classification, entailment analysis, abstractive document summarization, and machine translation. In the process, we evaluate the models' effectiveness against other state-of-the-art approaches. We then estimate the robustness of the different models against adversarial examples through five attack methods. Most importantly, we propose a series of novel methods for generating adversarial input perturbations and develop theoretical analyses from our observations. Finally, we interpret the differences in robustness between neural network models.
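The attack methods themselves are developed later in the thesis; purely as an illustration of the word-replacement idea described above, a minimal greedy substitution attack could be sketched as follows. The synonym table, scoring function, and greedy strategy here are all hypothetical stand-ins, not the methods actually proposed in this work:

```python
from typing import Callable, Dict, List

# Hypothetical synonym table; a real attack might draw candidates
# from a thesaurus or from nearest neighbors in embedding space.
SYNONYMS: Dict[str, List[str]] = {
    "good": ["fine", "decent"],
    "movie": ["film", "picture"],
    "great": ["superb", "strong"],
}

def greedy_word_substitution(
    tokens: List[str],
    score: Callable[[List[str]], float],  # model confidence in the correct label
) -> List[str]:
    """Greedily swap each word for the synonym that lowers the score most."""
    adv = list(tokens)
    for i, word in enumerate(tokens):
        best, best_score = adv[i], score(adv)
        for cand in SYNONYMS.get(word.lower(), []):
            trial = adv[:i] + [cand] + adv[i + 1:]
            s = score(trial)
            if s < best_score:
                best, best_score = cand, s
        adv[i] = best
    return adv

# Toy "model": confidence grows with the fraction of words
# found in a small positive-sentiment lexicon.
POSITIVE = {"good", "great", "movie"}

def toy_score(tokens: List[str]) -> float:
    return sum(t.lower() in POSITIVE for t in tokens) / len(tokens)

sentence = "a good movie with great acting".split()
adversarial = greedy_word_substitution(sentence, toy_score)
```

Because each substitution is kept only when it strictly lowers the model's confidence, the perturbed sentence never scores higher than the original while staying the same length and roughly preserving meaning.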