

Using the Transformer Model for Bidirectional Text Rewriting - The Case of History Books With Aligned Classical and Modern Texts

Advisor: 魏世杰
Co-advisor: 周清江 (Chi-Chang Jou)




Over the course of history, the written form of Chinese has undergone several drastic changes in its standards and styles, so modern readers find classical texts hard to comprehend. To narrow the comprehension gap between classical and modern texts and help people understand classical texts, this study addresses bidirectional rewriting between the two. Natural language processing techniques were used to process the parallel corpora, and a Transformer deep learning model was built to generate the corresponding sentences, so that readers can learn the rewriting rules between the two genres of text. The generated sentences were then evaluated with the text generation metrics BLEU and ROUGE. Based on the experimental results, this rewriting approach shows great potential for helping people learn classical texts and understand historical documents.
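
To make the evaluation step concrete, the following is a minimal sketch of how a generated sentence might be scored against a reference with BLEU and ROUGE. It is not the implementation used in this study: the character-level tokenization, the unsmoothed BLEU restricted to unigrams and bigrams, the choice of the ROUGE-L variant, and the function names `bleu` and `rouge_l` are all assumptions made for illustration.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU with uniform weights over 1..max_n and a brevity penalty.

    Assumption: each character is one token, a common choice for Chinese text.
    """
    cand, ref = list(candidate), list(reference)
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 score from the longest common subsequence of characters."""
    cand, ref = list(candidate), list(reference)
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative sentence pair, not taken from the study's corpus.
    reference = "學習並且時常溫習"
    candidate = "學習並時常複習"
    print("BLEU:", round(bleu(candidate, reference), 4))
    print("ROUGE-L:", round(rouge_l(candidate, reference), 4))
```

In practice, an established evaluation library with smoothing and support for multiple references would likely be preferable to this hand-rolled version, which is only meant to show what the two metrics measure: clipped n-gram precision for BLEU and longest-common-subsequence overlap for ROUGE-L.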


