
Question Generation Technology Based on Deep Learning for Q&A Robots (基於深度學習之Q&A機器人自動生成問題技術)

Advisor: 張世豪

Abstract


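Question-answering (Q&A) robots have advanced quickly in recent years, and many products and platforms already use them to answer users' questions automatically. They fall into two types: task-oriented service robots, which return a suitable answer to a customer's question, and chat robots, which converse with customers through entertaining replies. Answering a user's question correctly is difficult, because people phrase the same question in many ways and the robot may fail to understand what is being asked. A Q&A robot must therefore recognize diverse phrasings, which requires that its training question set contain varied questions worded the way people actually ask them. In practice, however, the question set is built by employees who write questions by hand and map them to answers in order to train the model. The number of questions they can write is limited, their wording may differ from how ordinary users ask, and questions that are nearly identical contribute little to training, so the robot's answers fail to match users' questions. Earlier Q&A robots may thus face two major problems: manually written question data is not diverse enough, and automatically generated question sets contain incomplete sentences.

In view of the maturity of deep learning techniques in recent years, this thesis designs and implements a deep-learning-based system for intelligently generating question sets, in order to solve these two problems.

The work proposed in this thesis, "Design and Implementation of Automatic Question Generation Technology for Q&A Robots Based on Deep Learning," can be divided into three parts: collecting diverse questions, generating questions automatically, and revising the training of the Q&A robot. First, because people ask questions in many ways, the thesis uses a web crawler to collect questions with varied phrasings. Second, to generate questions that are diverse and worded as people would ask them, the thesis analyzes the crawled questions and classifies them into "people, things, time, places, objects." This classification allows similar questions to be retrieved and the content of the user's question to be substituted into them. Because substitution can leave a question incomplete, the thesis uses GPT-2 to complete it into a well-formed question sentence. The system thereby helps employees extend the question set automatically and intelligently, reducing personnel cost and workload, improving question diversity and the accuracy of the Q&A robot, and helping enterprises manage operating costs more efficiently.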

Parallel Abstract


In recent years, question-and-answer (Q&A) robots have developed rapidly, and a large number of products and platforms now use them to answer users' questions automatically. Q&A robots can be divided into two types: tool-based service robots, which mainly return appropriate answers to customers' questions, and chat-type robots, which communicate with customers through interesting answers. It is not easy for a Q&A robot to answer a user's question correctly: because people ask questions in many different ways, the robot may not accurately understand what the user is asking. The robot therefore needs to be able to recognize a variety of phrasings, so during training its question set should include diverse questions phrased the way people actually ask them. However, a diverse question set is currently built by employees, who write questions themselves and map them to answers to create the training model. The number of questions employees can create is limited, the questions may not be phrased the way the general public would ask, and questions that are phrased too similarly may contribute nothing to training, so the robot's answers do not match users' questions. Past Q&A robots may therefore face two major problems: human-generated question data is not diverse enough, and the sentences in automatically generated question sets are not complete enough. With the maturing of deep learning technology in recent years, this thesis intends to design and implement a deep-learning-based intelligent question set generation system to solve these two problems.
The "Design and Implementation of Automatic Question Generation Technology for Q&A Robots Based on Deep Learning" proposed in this thesis can be roughly divided into three parts: collecting a variety of questions, generating questions automatically, and revising the training of the Q&A robot. First, because people ask questions in diverse ways, this thesis extracts a variety of question data by means of web crawling. Second, in order to generate questions that are varied and phrased the way people would ask them, this thesis analyzes the crawled questions and classifies them into "people, things, time, places, objects". Through this classification, similar questions can be extracted, and the content of the user's question can be substituted into them. Because substituting content may leave a question incomplete, this thesis uses GPT-2 to complete the question and extend it into an intact question sentence. This helps employees extend the question set automatically and intelligently, reducing personnel costs and personnel burden, improving the diversity of questions and the accuracy of Q&A robots, and making enterprise management of operating costs more efficient.
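The classification and substitution steps described in the abstract might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the cue-word table, the function names `classify`, `generate_variants`, and `ensure_question_mark`, and the `(question, topic)` input format are all hypothetical, and `ensure_question_mark` is only a trivial stand-in for the GPT-2 completion step.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical cue words for the five categories named in the abstract.
# Order matters: more specific cues ("what time") are checked before "what".
CATEGORY_CUES: Dict[str, List[str]] = {
    "people": ["who", "whom", "whose"],
    "time": ["when", "what time", "how long"],
    "places": ["where"],
    "objects": ["which"],
    "things": ["what", "how", "why"],
}


def classify(question: str) -> str:
    """Return the first category whose cue word appears in the question."""
    text = question.lower()
    for category, cues in CATEGORY_CUES.items():
        for cue in cues:
            if re.search(r"\b" + re.escape(cue) + r"\b", text):
                return category
    return "things"  # assumed fallback when no cue word matches


def ensure_question_mark(text: str) -> str:
    """Placeholder completion step (NOT GPT-2): only appends a missing '?'."""
    text = text.strip()
    return text if text.endswith("?") else text + "?"


def generate_variants(
    seed_question: str,
    seed_topic: str,
    crawled: List[Tuple[str, str]],
    complete: Optional[Callable[[str], str]] = None,
) -> List[str]:
    """Substitute the seed topic into crawled questions of the same category.

    `crawled` holds (question, topic) pairs; annotating each crawled question
    with its topic phrase is an assumption made for this sketch.
    """
    category = classify(seed_question)
    variants: List[str] = []
    for question, topic in crawled:
        # Keep only "similar" questions: same category, topic actually present.
        if classify(question) != category or topic not in question:
            continue
        candidate = question.replace(topic, seed_topic)
        if complete is not None:
            # Substitution may leave the sentence incomplete; repair it here.
            candidate = complete(candidate)
        if candidate != seed_question and candidate not in variants:
            variants.append(candidate)
    return variants


if __name__ == "__main__":
    crawled_questions = [
        ("Where can I buy a train ticket?", "a train ticket"),
        ("where to find a parking lot", "a parking lot"),
        ("When does the museum open?", "the museum"),
    ]
    for variant in generate_variants(
        "Where can I get a student discount?",
        "a student discount",
        crawled_questions,
        complete=ensure_question_mark,
    ):
        print(variant)
```

In a fuller system, the `complete` argument is where a language model such as GPT-2 would be plugged in, and the keyword table would presumably be replaced by a learned classifier; both choices are left open here.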

