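Applying Tag Trees to the Post-Classification and Presentation of Document Retrieval Results: A Case Study of Historical Documents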

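The goal of this thesis is to help users grasp the key points of retrieval results more easily, refine those results, and find useful information. Most document retrieval systems present results as a list of document summaries, and the list is often too long: hundreds of records spread across dozens of pages are more than a user can digest. Users spend a great deal of time and effort browsing documents one by one and, discouragingly, still may not find the relevant ones. To address this problem, we propose the tag tree, a structure for post-classifying retrieval results and presenting their features. A tag tree organizes the keywords in the documents that have well-defined attributes, such as personal names, place names, and times. From the information in a tag tree, users can simply and intuitively identify the outline and key points of the retrieval results. By viewing the results along different axes (facets), such as by personal name, by place name, or by time, they can also see how the important features relate to one another, which reduces misunderstanding of the information the system provides. Hyperlinks in the tree let users conveniently narrow the set of documents and refine the retrieval results. We implement tag trees on real historical material, a collection of historical documents (古文書), and carry out case studies and analysis that show concretely how users can search holistically, along multiple axes, and with a focus on key points. In addition, when analyzed alongside existing historical research materials such as historical chronologies and historical publications, tag trees offer users and historians a way to verify research findings.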

Keywords

tag; post-classification; presentation; historical documents; document retrieval; attribute; classification

Parallel Abstract

This thesis presents an approach to classifying and presenting query results. General-purpose retrieval systems such as Web search engines cannot use domain knowledge to arrange query results into a more readable form. Consequently, it is often difficult for the user to take full advantage of the returned documents when their number is very large. In this thesis we propose the notion of tag trees to post-classify and post-process query results according to features from different viewpoints. We incorporated our method into a digital library of historical documents. Using "name", "location", and "time" as coordinates, with a pre-defined set of keywords for each coordinate, we classify the retrieved documents according to the number of documents in which each keyword appears. The frequency with which a keyword appears in the retrieved documents also offers important insight into further query refinements and related queries.
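
The classification step described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the keyword sets, the sample documents, the two-level tree shape, and the function names `document_frequency`, `build_tag_tree`, and `refine` are all assumptions made for illustration, and plain substring matching stands in for whatever keyword recognition the actual system uses.

```python
from collections import defaultdict

# Hypothetical keyword sets per coordinate; the abstract says such sets are
# pre-defined but does not list them.
COORDINATES = {
    "name": ["Lin", "Chen"],
    "location": ["Tamsui", "Tainan"],
    "time": ["1850", "1895"],
}

# Hypothetical retrieved documents (id -> text), invented for this example.
DOCS = {
    1: "Lin sold land in Tamsui, 1850",
    2: "Lin leased a field in Tainan, 1895",
    3: "Chen and Lin signed a contract in Tamsui, 1850",
}


def document_frequency(docs, keywords):
    """Map each keyword to the set of ids of documents that contain it."""
    hits = defaultdict(set)
    for doc_id, text in docs.items():
        for keyword in keywords:
            if keyword in text:  # naive substring match (assumption)
                hits[keyword].add(doc_id)
    return hits


def build_tag_tree(docs, primary, coordinates=COORDINATES):
    """Build a two-level tree: primary-axis tags, then co-occurring tags
    on the other axes, each ordered by document count (descending)."""
    tree = []
    top = document_frequency(docs, coordinates[primary])
    for tag, ids in sorted(top.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        subset = {doc_id: docs[doc_id] for doc_id in ids}
        children = {}
        for axis, keywords in coordinates.items():
            if axis == primary:
                continue
            counts = document_frequency(subset, keywords)
            children[axis] = sorted(
                ((k, len(v)) for k, v in counts.items()),
                key=lambda pair: (-pair[1], pair[0]),
            )
        tree.append({"tag": tag, "count": len(ids), "children": children})
    return tree


def refine(docs, *tags):
    """Narrow the result set to documents containing every selected tag,
    which is what following a hyperlink in the tree would do."""
    return {i: text for i, text in docs.items() if all(t in text for t in tags)}


tree = build_tag_tree(DOCS, primary="name")
narrowed = refine(DOCS, "Lin", "Tamsui")
```

Calling `build_tag_tree(DOCS, primary="location")` or `primary="time"` regroups the same result set along a different axis, which is the multi-viewpoint browsing the abstract describes.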

Parallel Keywords

tag; post-classification; presentation; historical document; retrieval; attribute; tagging system; tree

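Cited by

孔容偉 (2011). Design and Construction of the Chinese Recorder Index Retrieval System [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2011.01053

蕭屹灵 (2008). The Japanese Colonial-Era Court Archives System and Its Post-Classification Presentation [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2008.02437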