Thesis


Temporal Information Processing and a Chronological System for Chinese Historical Documents

Advisor: 項潔 (Jieh Hsiang)


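Of all forms of historical writing, the chronicle was the earliest to develop. A chronicle arranges events in the order in which they occurred, so that later readers can follow them in sequence. In China, histories in this form had already appeared in the Pre-Qin period (先秦時期): the Spring and Autumn Annals (《春秋》), Zuo Zhuan (《左傳》), and Bamboo Annals (《竹書紀年》) all describe the events of their time year by year. Later, Zizhi Tongjian (《資治通鑑》), compiled by Sima Guang (司馬光) in the Northern Song, became the classic of the chronicle form, spanning more than a thousand years from the Partition of Jin to the Later Zhou of the Five Dynasties. Besides chronicles, many other historical sources also carry temporal information, such as the Basic Annals (本紀) sections of the Official Histories (正史). This research attempts to build a historical timeline spanning several thousand years of Chinese history. Regular expressions are used to automatically extract temporal tags from Chinese historical documents; after processing, a temporal information system aligns the documents with one another in time. Built on two main components, temporal information processing and system presentation, the research aims to offer a systematic method for handling histories that carry temporal information, giving users a more convenient and broader way to read them across the long span of history.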


The chronicle is one of the oldest forms of historical representation. It records events in chronological order and gives researchers a way to follow history as it unfolded. In China, this form can be dated back to the Pre-Qin period (先秦時期): the Spring and Autumn Annals (春秋), Zuo Zhuan (左傳), and Bamboo Annals (竹書紀年) are all written in chronological order. During the Song Dynasty (宋朝), Sima Guang (司馬光) wrote Zizhi Tongjian (資治通鑑), a masterpiece of Chinese chronicles that set the standard for all later chronicles. Zizhi Tongjian covers Chinese history from 403 BC to 959 AD. In addition to historical records written as chronicles, the biographical sketches of emperors (Benji, 本紀) in the Official Histories (Zhengshi, 正史) are usually also written in chronological style. This research tries to create a chronicle covering ancient Chinese history by combining chronicles such as Zizhi Tongjian with the chronological records in the Official Histories, and to develop a method for processing temporal information in Chinese historical documents. We first use regular expressions to automatically annotate temporal information in Chinese historical documents. We then build a chronological system to display the processed records. Through this system, we hope to give users a more convenient way to read these historical documents and to find inter-document relationships.
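To make the two steps concrete, the following is a minimal Python sketch of regex-based annotation followed by a simple timeline lookup. It is not the implementation described in the thesis. The two-entry era table, the pattern, the function names (`cn_to_int`, `annotate`, `build_timeline`), and the sample sentences are all assumptions chosen for illustration; a real system would need a complete table of era names and would have to handle intercalary months, sexagenary-cycle days, and the offset between the lunisolar and Western calendar years.

```python
import re
from collections import defaultdict

# Hypothetical two-entry era table: era name -> Western year of its first year.
# A full system would need every era name of every dynasty.
ERA_START = {"貞觀": 627, "開元": 713}

DIGITS = {"元": 1, "正": 1, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}


def cn_to_int(s):
    """Convert a Chinese numeral below 100 (e.g. 元, 三, 十七, 二十三) to int."""
    total = 0
    if "十" in s:
        tens, _, s = s.partition("十")
        total = (DIGITS[tens] if tens else 1) * 10
    if s:
        total += DIGITS[s]
    return total


# era name + year, optionally followed by a month, e.g. 貞觀三年二月
DATE_RE = re.compile(
    "(?P<era>" + "|".join(ERA_START) + ")"
    "(?P<year>[元一二三四五六七八九十]+)年"
    "(?:(?P<month>[正一二三四五六七八九十]+)月)?"
)


def annotate(text):
    """Return (matched expression, approximate Western year, month or None)."""
    results = []
    for m in DATE_RE.finditer(text):
        year = ERA_START[m.group("era")] + cn_to_int(m.group("year")) - 1
        month = cn_to_int(m.group("month")) if m.group("month") else None
        results.append((m.group(0), year, month))
    return results


def build_timeline(docs):
    """Map each year to the (document name, expression) pairs that mention it."""
    timeline = defaultdict(list)
    for name, text in docs.items():
        for expr, year, _month in annotate(text):
            timeline[year].append((name, expr))
    return dict(sorted(timeline.items()))


# Invented sample sentences, not quotations from any source.
docs = {"doc_a": "貞觀三年二月,大雪。", "doc_b": "貞觀三年,有事。開元元年,改元。"}
for year, mentions in build_timeline(docs).items():
    print(year, mentions)
```

Keying the timeline on a normalized year is what lets passages from different documents be placed side by side; the same idea extends to finer keys such as (year, month) once months are normalized reliably.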




