
Question Generation from Text Using Inference and Transformers

Advisor: 蘇豐文

Abstract


Question generation is a field of research that has grown in popularity over the years, as educators seek ways to make test generation easier with machine learning and artificial intelligence. In this thesis we study how automated end-to-end question generation with transformers can produce an understandable question by making an inference over sample paragraphs. The model is trained end to end: it attends to the context paragraph, builds an understanding of its sentences, and generates a question whose answer does not appear directly in the paragraph. We propose an inference approach that finds hidden sentences through discourse analysis and paraphrasing techniques built on fine-tuned transformers; the hidden or newly formed sentences are then fed into a model that generates questions from them. The Stanford parser is also used to obtain a clearer view of the parts of speech, focusing in particular on verbs, pronouns, and other key entities in each sentence. Experiments were conducted on context paragraphs from the SQuAD 1.1 dataset, where we transformed the original input paragraphs using the inference rules. After transforming all sentences, we fed the shortened paragraphs to the transformer model to generate questions that require a deeper level of understanding. The analysis shows that the model can generate deeper questions than the simpler ones of previous work, in which the answer can be found verbatim in the input paragraph. Paraphrasing and parsing the sentences to construct an inference over the paragraph appear to help the model generate at this deeper level.
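
To make the pipeline described above concrete, the sketch below (a minimal illustration, not the thesis's actual implementation) shows the two stages the abstract mentions: parsing each sentence for verbs, pronouns, and named entities with a Stanford-style parser, and generating a question from a transformed paragraph with a sequence-to-sequence transformer. The stanza library, the t5-base checkpoint, the "generate question:" prefix, and the helper names are assumptions made for illustration; the thesis's fine-tuned model and inference rules would take their place.

# Illustrative sketch only: the checkpoint, prompt prefix, and helper names
# below are assumptions, not the thesis's implementation.
import stanza
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Stanford NLP's neural pipeline, used here for POS tags and named entities.
stanza.download("en")
parser = stanza.Pipeline("en", processors="tokenize,pos,ner")

# Placeholder seq2seq transformer; in practice it would be fine-tuned on
# SQuAD 1.1-style (paragraph, question) pairs before asking useful questions.
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

def key_parts(paragraph):
    """Return the verbs, pronouns, and named entities of each sentence."""
    doc = parser(paragraph)
    parts = []
    for sent in doc.sentences:
        parts.append({
            "sentence": sent.text,
            "verbs": [w.text for w in sent.words if w.upos == "VERB"],
            "pronouns": [w.text for w in sent.words if w.upos == "PRON"],
            "entities": [e.text for e in sent.ents],
        })
    return parts

def generate_question(transformed_paragraph):
    """Generate one question about a (possibly inferred) paragraph."""
    inputs = tokenizer("generate question: " + transformed_paragraph,
                       return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    context = ("Oxygen is released by plants during photosynthesis. "
               "It is then breathed in by animals.")
    print(key_parts(context))          # inputs to the inference/paraphrase step
    print(generate_question(context))  # question from the transformed paragraph

In this sketch the output of key_parts would feed the discourse-analysis and paraphrasing step that produces the hidden or shortened sentences, and those sentences, rather than the raw context, would then be passed to generate_question.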

