We experiment on the Formosa Language Understanding Dataset (FLUD), provided by the Science and Technology Policy Research and Information Center (STPI) of the National Applied Research Laboratories. Previous work on FLUD includes Multilingual Machine Reading Comprehension based on BERT Model (Wu, 2019) and the Formosa Grand Challenge, a competition hosted by STPI and the Ministry of Science and Technology. The machine reading comprehension (MRC) task we study consists of multiple-choice reading comprehension questions written in Traditional Chinese. Although datasets and research for Traditional Chinese are scarce compared with those for Simplified Chinese or English, the text can be converted to Simplified Chinese so that models pretrained on Simplified Chinese can be applied. Fine-tuning the pretrained models BERT-wwm-ext-base and RoBERTa-wwm-ext-base on the converted data alone already surpasses the results of the previous work. In addition, we propose training with an auxiliary Simplified Chinese dataset and combining multiple models through ensemble learning. The auxiliary dataset substantially improves model performance, and ensembling yields a further slight improvement over the individual models' predictions, achieving state-of-the-art performance on FLUD; we consider ensembling a useful technique in a highly competitive contest. Finally, we recreate the setting of the Formosa Grand Challenge final by transcribing the final-round audio files with speech-to-text and running our trained models on the transcripts, which also yields results slightly better than those of the previous work. Our experiments indicate that semantic understanding and speech-to-text are the two largest obstacles in this task, and we therefore suggest that future research on related MRC tasks focus on these two factors.