The Review on the Implementation of An Archives Knowledgebase System in National Archives Administration


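Since 2004, the National Archives Administration (hereafter "the Administration") has studied the feasibility of using text mining to uncover the knowledge contained in official records. In the initial stage, government records on the 921 earthquake served as the data source, and the work tried to present the knowledge-mining results and the causal relationships among issues visually; this was the experimental first phase of building the Administration's archives knowledge base. In 2006 the work was expanded into a second phase, which introduced the phase 1 research results into the National Archives Information System to improve the existing models of archival search and use. At the same time, the system's ability to write case summaries automatically was developed, in the hope that it can serve as a basis for the automatic construction of entries for an online archival encyclopedia. This paper comprehensively examines and reviews the results of building the Administration's archives knowledge base. The experience gained in handling the various problems encountered during implementation, and the solutions the Administration adopted, should serve as a reference for archival colleagues who intend to apply text mining.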


The National Archives Administration (NAA) has been involved since 2004 in several research projects that use text-mining technology to discover the knowledge contained in public records. An extended implementation project (phase 2) was launched in 2006 to enhance the system's functionality. The outcomes of research phase 1 were applied to the National Archives Information System (NAIS) to improve both search efficiency and the application models within NAIS. Meanwhile, we tried to apply text-mining technology to compose record abstracts automatically, and we look forward to producing certain Wikipedia entries in the near future. This paper comprehensively reviews the results of the current archives knowledgebase implementation at the NAA. It also describes the various problems that arose during system development and how we dealt with them, an experience that may serve as a reference for those who intend to apply text-mining technology in the field of archives.


Ian H. Witten,Eibe Frank(2005).Data Mining: Practical Machine Learning Tools and Techniques, 2/e.New York:Morgan Kaufmann.
Otis Gospodnetic,Erik Hatcher(2005).Lucene In Action.Greenwich:Manning.
Stuart J. Russell,Peter Norvig(2002).Artificial Intelligence: A Modern Approach, 2/e.New York:Prentice Hall.
Tom M. Mitchell(1997).Machine Learning.New York:McGraw-Hill.
