
DocuSky: A Platform for Constructing and Analyzing Personal Text Databases

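Abstract


With the development of the digital humanities, traditional archival databases built by academic or other large institutions no longer meet researchers' needs. Although these archives provide high-quality texts, their contents are revised and expanded very slowly, and they lack digital tools that help users analyze texts of interest in greater depth. DocuSky is a digital humanities research platform that addresses these problems. It allows users to upload full texts, metadata, and tagged texts to build their own personal text databases. Users have complete control over the contents of these databases. Once a database is built, users can access and analyze its contents with the various open tools the platform provides. In its system design, DocuSky holds that texts and tools must be separated and that the user interface must be operable in a browser. The paper also gives several examples showing how these principles are applied in tool development.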

Parallel Abstract


The main, and usually the only, purpose of most traditional digital libraries and archiving systems is to provide good content together with a retrieval system that helps users find the documents they want. This is often not sufficient for humanists who want to use digital tools to explore the properties of interesting subsets of a library or system. Moreover, humanists usually do not rely on such libraries or systems alone: they keep texts of interest on their own hard disks, and conventional systems make it hard to analyze the properties of texts stored on personal computers. To fix this problem, it is desirable to have a platform that allows one to build personal databases that support not only common retrieval functions but also text-analytic ones. In this paper, we propose DocuSky as such a platform. DocuSky allows users to upload text content to build their personal databases. It supports full-text retrieval, post-classification over a search result, and analysis of tagged terms. Full-text retrieval is the common way to search a database for documents of interest. For any search result, post-classification groups the result by its metadata values and shows the resulting distribution. Analysis of tagged terms, in turn, returns a list of the tagged terms occurring in that search result. These are the three major functions offered by the well-known Taiwan History Digital Library (THDL) system. In addition to these elementary functions, DocuSky provides a couple of tools to help users analyze the contents of a database. The advance of digital humanities requires close cooperation between computer engineers and digital humanists. DocuSky encourages tool developers and humanists to rethink the roles of texts and content-analytics tools. To reduce the effort of tool development, we designed a set of DocuSky APIs and widgets that ease access to the content of a personal database.
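
The three elementary functions named in the abstract can be illustrated with a minimal sketch over a toy in-memory corpus. This is not the DocuSky API: the `Doc` structure, the function names, and the sample documents are all hypothetical, chosen only to show what full-text retrieval, post-classification, and tagged-term analysis compute on a search result.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    """One document in a toy personal database (hypothetical structure)."""
    text: str
    metadata: dict
    tags: list = field(default_factory=list)  # (tag_type, term) pairs


def fulltext_search(docs, query):
    """Full-text retrieval: keep documents whose text contains the query."""
    return [d for d in docs if query in d.text]


def post_classify(result, key):
    """Post-classification: count a search result by one metadata field."""
    return Counter(d.metadata.get(key, "(missing)") for d in result)


def tagged_term_stats(result, tag_type):
    """Tagged-term analysis: count tagged terms of one type in a result."""
    return Counter(term for d in result for t, term in d.tags if t == tag_type)


# Invented sample data, for illustration only.
docs = [
    Doc("Land deed signed in Tainan", {"year": "1850"}, [("place", "Tainan")]),
    Doc("Land lease renewed in Tainan", {"year": "1850"}, [("place", "Tainan")]),
    Doc("Land tax memorial from Taipei", {"year": "1875"}, [("place", "Taipei")]),
    Doc("Harbor report from Keelung", {"year": "1875"}, [("place", "Keelung")]),
]

hits = fulltext_search(docs, "Land")
year_distribution = post_classify(hits, "year")
place_terms = tagged_term_stats(hits, "place")
print(year_distribution)
print(place_terms)
```

Both summaries are computed over the search result rather than the whole database, which is the property the abstract attributes to post-classification and tagged-term analysis.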

