Authors of ancient Chinese general histories rarely state clearly which sources they copied from or how their texts relate to earlier works. Even when the source is known, readers cannot easily locate the corresponding passages and identify every difference between them. Identifying sources, citation relationships, and the differences between similar passages would offer a workable way to trace copied text back to its origin and to establish citation relationships among historical materials for further analysis. The differences between similar passages can also shed light on the values and external influences that shaped an author's copying and editing.

Finding sources, citation relationships, and differences between similar passages depends on comparing the wording of texts. Done manually, however, such comparison is too time-consuming for a large corpus, and the accuracy of the results is hard to guarantee. This study therefore takes the "Tongzhi" (《通志》) as its subject and builds both a text comparison algorithm and a system for presenting the comparison results. By performing the comparison automatically and making the resulting data available to users, it proposes an effective alternative to manual comparison: a digital humanities approach to identifying sources, citation relationships, and differences between similar passages.

Finally, using the system built in this study, I conduct a preliminary exploration of the copying habits of Zheng Qiao (鄭樵), the author of the "Tongzhi", and of the sources of the "Tongzhi" text on the Former Han period, combining micro-level examination of differences between similar passages with macro-level analysis of the comparison data. This exploration demonstrates the system's potential to assist related research.
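The abstract does not specify the comparison algorithm, so the following is only a rough illustration of what character-level comparison of two similar passages can look like. It is a minimal sketch that assumes Python's standard `difflib` module; the function names `diff_passages` and `similarity` and the sample strings are hypothetical and do not come from the study or from any historical text.

```python
from difflib import SequenceMatcher


def diff_passages(source: str, copy: str) -> list[tuple[str, str, str]]:
    """List the differences between two passages.

    Each item is (tag, source_segment, copy_segment), where tag is
    'replace', 'delete', or 'insert' as reported by difflib.
    Identical stretches are skipped.
    """
    # autojunk=False turns off difflib's popularity heuristic, which
    # would otherwise ignore frequent characters in long passages.
    matcher = SequenceMatcher(None, source, copy, autojunk=False)
    return [
        (tag, source[i1:i2], copy[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity(source: str, copy: str) -> float:
    """Return difflib's similarity ratio, a value between 0.0 and 1.0."""
    return SequenceMatcher(None, source, copy, autojunk=False).ratio()


if __name__ == "__main__":
    # Made-up placeholder strings, not quotations from any source.
    earlier = "abcdef"
    later = "abcxef"
    print(diff_passages(earlier, later))
    print(round(similarity(earlier, later), 3))
```

In a sketch like this, the list of differences corresponds to the micro-level view of individual passages, while a similarity score collected over many passage pairs could feed a macro-level summary. A real system would also need a way to find candidate similar passages in the first place, which this example does not attempt.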