

A Research of Applying Multi-hop Attention and Memory Relations on Memory Networks

Abstract


Machine learning and deep learning have advanced rapidly in recent years and have achieved major breakthroughs in natural language processing. Neural networks can carry out complex language tasks such as text classification, summarization, question answering, machine translation, and image caption generation. This thesis takes memory networks as its research subject and uses the question-answering task as the validation application. The model stores prior knowledge in memory, uses an attention mechanism to find the memories relevant to the question, and reasons out the final answer. The question-answering experiments use the bAbI dataset provided by Facebook, which contains 20 different types of QA tasks and allows the model's accuracy to be verified across tasks. By computing the relations between memories, this study reduces the number of memory associations, cutting the weight computation by 26.8% while also improving the model's accuracy by up to about 9.2% in the experiments. The experiments further use a smaller amount of data as the validation target, showing that a considerable improvement is still achieved even when the dataset is insufficient.

Parallel Abstract


Machine learning and deep learning have advanced rapidly in recent years, producing great breakthroughs in many areas of natural language processing. Complex language tasks, such as text classification, summarization, question answering, machine translation, and image caption generation, can be solved by neural networks. In this paper, we propose a model based on memory networks that applies a multi-hop mechanism to process a small set of sentences, and the question-answering task is used as the verification application. The model first saves the knowledge in memory, then finds the relevant memories through the attention mechanism, and the output module reasons out the final answer. All experiments use the bAbI dataset provided by Facebook, which contains 20 different kinds of QA tasks for evaluating the model from different aspects. The proposed approach reduces the number of memory associations by calculating the relations between memories; besides cutting the weight computation by 26.8%, it also improves the accuracy of the model by up to about 9.2% in the experiments. The experiments additionally use a smaller amount of data to verify that the approach still yields improvement when the dataset is insufficient.
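
To make the retrieval-and-reasoning loop concrete, the sketch below shows a minimal multi-hop attention read over a memory of embedded sentences, in the spirit of end-to-end memory networks. It is not the paper's implementation: the plain-NumPy representation, the single shared embedding for addressing and reading, the additive state update, and the fixed hop count are all simplifying assumptions, and the memory-relation step that prunes associations (the source of the reported 26.8% reduction) is not reproduced here.

# A minimal sketch of multi-hop attention over memory (MemN2N-style).
# Assumptions for illustration only: plain NumPy vectors, one shared
# embedding for addressing and reading, additive state updates, fixed hops.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(memory, query, hops=3):
    """memory: (n_slots, d) embedded facts; query: (d,) embedded question."""
    u = query
    for _ in range(hops):
        scores = memory @ u   # match every memory slot against the current state
        p = softmax(scores)   # attention weights over memory slots
        o = p @ memory        # attention-weighted read vector
        u = u + o             # fold the read into the state for the next hop
    return u                  # final state, fed to an answer classifier

# Toy usage: five memory slots (e.g. embedded bAbI sentences), embedding size 8.
rng = np.random.default_rng(0)
memory = rng.normal(size=(5, 8))
question = rng.normal(size=8)
print(multi_hop_read(memory, question).shape)   # (8,)

In the actual model, the memory slots and the question would be learned embeddings of the bAbI sentences, and the paper's relation computation between memories would reduce the number of associations considered before such a read, which is where the reported savings in weight computation come from.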

