Exploring Pre-trained Neural Networks for Machine Reading Comprehension of Spoken Content

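Research and development of AI personal assistants such as Alexa, Siri, Google Assistant, and Cortana has surged, along with many use cases around shopping, music, and more. As demand for voice interfaces on mobile and virtual reality devices keeps growing, spoken language understanding has recently attracted the attention of many researchers. This thesis studies how to build systems that read a passage of text and answer comprehension questions. We regard reading comprehension as an important task for evaluating how well a system understands human language. If we can build high-performing reading comprehension systems, they will become a key technology for applications such as question answering and dialogue systems. Even when the user interface of a spoken language understanding system is a voice query, most such systems assume that the required text is available independently, and the language understanding model is usually optimized separately from the speech recognition system. Although speech recognition accuracy has improved in recent years, recognition errors still degrade language understanding performance. The problem becomes more severe on AI devices, where interactions tend to be more conversational. We aim to cover the essence of neural reading comprehension and to present our efforts in building effective neural reading comprehension models. More importantly, we aim to understand what these models actually learn and how much depth of language understanding is needed to solve current tasks. We also summarize recent progress and discuss future directions and open problems in the field. In particular, we pioneer three new research directions: multi-task models; using masked language models to mitigate the impact of speech recognition errors; and model compression through knowledge distillation. We implement these ideas on Chinese listening comprehension and demonstrate that the methods are effective.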

Keywords

Spoken question answering; deep learning; transfer learning; multi-task learning; model compression

Parallel Abstract

While a future of interacting verbally with pervasive computers is not yet here, many strides toward it have been made in recent years. Intelligent assistants, such as Alexa, Siri, and Google Assistant, are becoming increasingly common. For truly intelligent assistants that can help us with myriad daily tasks, AI should be able to answer a wide variety of questions from people, beyond straightforward factual queries such as “Which artist sings this song?”. This thesis tackles the problem of reading comprehension: how to build computer systems that read a passage of text and answer comprehension questions. On the one hand, we think that reading comprehension is an important task for evaluating how well computer systems understand human language. On the other hand, if we can build high-performing systems for reading comprehension of spoken content, they would be a crucial technology for applications such as spoken question answering and dialogue systems. Language model pretraining has led to significant performance gains, but automatic speech recognition errors and inference speed become problems. To tackle these challenges, we propose multi-task fine-tuning and model compression. Experimental results show that our methods significantly outperform the baseline methods while also substantially speeding up model inference.

Parallel Keywords

Spoken Question Answering; Deep Learning; Transfer Learning; Multi-Task Learning; Model Compression

