
Author: 蕭雅方 (Hsiao, Ya-Fang)
Thesis Title: 以有限配對資料訓練事實問題生成模型之研究
(Factual Question Generation Model Construction with Limited Paired Training Data)
Advisor: 柯佳伶 (Koh, Jia-Ling)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of Publication: 2020
Academic Year of Graduation: 108 (ROC calendar; 2019-2020)
Language: Chinese
Number of Pages: 63
Chinese Keywords: 問題生成, 深度學習, 自然語言處理, 語言模型, 遷移學習
English Keywords: Question Generation, Deep Learning, Natural Language Processing, Language Model, Transfer Learning
DOI URL: http://doi.org/10.6345/NTNU202001441
Document Type: Academic thesis
    This thesis considers the situation in which the paired data of reading sentences and their corresponding questions are limited. Based on the concept of transfer learning, unpaired data are used to strengthen the learning of an encoder-decoder model, so that the model can still generate questions comparable to those of a model trained on a large amount of paired data. This study adopts a sequence-to-sequence model: first, in an unsupervised manner, a large number of sentences and questions that require no pairing labels are used to train autoencoder architectures. Next, the pretrained encoder that understands sentences and the pretrained decoder that generates questions are extracted and combined, and a transfer layer is added to the encoder to construct a new model; selected sentence-question pairs are then used to fine-tune the model parameters through transfer learning. The experimental results show that the transfer learning approach designed in this thesis, together with the proposed training strategies, yields a better question generation model when trained with half of the sentence-question pairs removed than a model trained directly on all of the paired training data.

    In real applications, there is usually not a large number of sentence-question pairs available for training a question generation model. To address this problem, we adopt network-based transfer learning, using unpaired training data to enhance the learning of the encoder-decoder model. Accordingly, the obtained model still achieves a generation quality similar to that of a model trained directly on a large amount of paired data. In this study, we use a large number of sentences and questions, which do not need to be labeled as pairs, to train two autoencoders, respectively. We then combine the pretrained encoder, which encodes the semantics of a sentence, with the pretrained decoder, which generates a question. Next, we insert a transfer layer into the encoder and fine-tune the model parameters on a small number of paired data. The experimental results show that, by applying the designed training strategies, the question generation model trained with less than half of the paired training data still achieves better performance than the model trained directly on all of the training data.
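
    To make the two-stage procedure above concrete, the following is a minimal PyTorch-style sketch, not the thesis implementation: the names (Encoder, Decoder, QGModel, hid_dim) and the choice of a tanh-activated linear transfer layer are illustrative assumptions, and the attention mechanism and the BERT encoder variant described in the thesis are omitted for brevity.

        # Minimal sketch (assumed names; not the thesis code) of the pipeline:
        # 1) pretrain a sentence autoencoder and a question autoencoder on
        #    unpaired text; 2) combine the sentence encoder with the question
        #    decoder through a transfer layer; 3) fine-tune on limited pairs.
        import torch
        import torch.nn as nn

        class Encoder(nn.Module):
            def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

            def forward(self, tokens):               # tokens: (batch, seq)
                _, h = self.gru(self.embed(tokens))  # h: (1, batch, hid_dim)
                return h

        class Decoder(nn.Module):
            def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
                self.out = nn.Linear(hid_dim, vocab_size)

            def forward(self, tokens, h0):           # teacher forcing
                o, _ = self.gru(self.embed(tokens), h0)
                return self.out(o)                   # (batch, seq, vocab)

        class QGModel(nn.Module):
            """Pretrained sentence encoder + transfer layer + question decoder."""
            def __init__(self, sent_encoder, ques_decoder, hid_dim=256):
                super().__init__()
                self.encoder = sent_encoder          # pretrained on sentences
                self.transfer = nn.Linear(hid_dim, hid_dim)
                self.decoder = ques_decoder          # pretrained on questions

            def forward(self, sent, ques_in):
                # Map the sentence representation into the question space,
                # then decode a question conditioned on it.
                h = torch.tanh(self.transfer(self.encoder(sent)))
                return self.decoder(ques_in, h)

    Under this sketch, each encoder-decoder pair is first trained as an autoencoder to reconstruct its own unpaired input with cross-entropy loss; QGModel is then fine-tuned on the limited sentence-question pairs, with the transfer layer absorbing much of the adaptation between the two representation spaces.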

    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1: Introduction
        1.1 Research Motivation and Objectives
        1.2 Research Scope and Limitations
        1.3 Research Methods
        1.4 Thesis Organization
    Chapter 2: Literature Review
        2.1 Question Generation Models
            2.1.1 Neural-Network Question Generation Models
            2.1.2 Natural Language Generation Techniques
        2.2 Transfer Learning
        2.3 Attention Mechanism
    Chapter 3: Problem Definition and System Architecture
        3.1 Problem Definition
        3.2 System Architecture
            3.2.1 Encoder
                (1) Encoder component based on a recurrent neural network (Encoder GRU)
                (2) Encoder component based on the BERT model (Encoder BERT)
            3.2.2 Decoder (Decoder GRU)
            3.2.3 Extended Encoder
    Chapter 4: Training Method for the Question Generation Model
        4.1 Data Preprocessing
            4.1.1 Word Segmentation
            4.1.2 Input Encoding
            4.1.3 Input Embedding
            4.1.4 Zero Padding
        4.2 Pre-training of the Models
        4.3 Transfer Learning of the Model
    Chapter 5: Experimental Results and Discussion
        5.1 Dataset Sources and Parameter Settings
        5.2 Evaluation Metrics
            (1) BLEU (Bilingual Evaluation Understudy)
            (2) ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
        5.3 Experimental Results and Discussion
            5.3.1 Evaluation of Different Encoder Components
            5.3.2 Evaluation of Transfer Learning
            5.3.3 Evaluation of Adding the Transfer Layer
            5.3.4 Evaluation of the Extended Encoder
            5.3.5 Evaluation of the Number of Paired Training Examples
    Chapter 6: Conclusion and Future Research Directions
    References
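
    Section 5.2 above lists BLEU and ROUGE as the evaluation metrics. As an illustration only, the snippet below computes both for a toy example using the nltk and rouge-score Python packages; the sentences are invented, and this is not the evaluation code used in the thesis.

        # Toy example: score a generated question against a reference
        # question with BLEU and ROUGE-L (illustrative, invented sentences).
        from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu
        from rouge_score import rouge_scorer

        reference = "what year was the company founded".split()
        candidate = "when was the company founded".split()

        # BLEU: modified n-gram precision of the candidate against the
        # reference, smoothed so short sentences do not score zero.
        bleu = sentence_bleu([reference], candidate,
                             smoothing_function=SmoothingFunction().method1)

        # ROUGE-L: recall-oriented score over the longest common subsequence.
        scorer = rouge_scorer.RougeScorer(["rougeL"])
        rouge_l = scorer.score(" ".join(reference), " ".join(candidate))["rougeL"]

        print(f"BLEU = {bleu:.3f}, ROUGE-L F1 = {rouge_l.fmeasure:.3f}")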

