
The Concept and Practice of a Batch Tagging Tool: Tagging the Chinese Union Version of the Bible with the DocuSky Batch Tagging Tool

Batch Tagging Tool of DocuSky: A Case Study of Bible

Abstract


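Term tagging of electronic texts is often regarded as foundational work in digital humanities research. The release of MARKUS, the semi-automatic tagging platform for classical texts from Leiden University in the Netherlands, meant that humanities scholars without a background in information technology could also tag terms in electronic texts, and then use suitable digital tools to cross-reference, aggregate, and disambiguate those terms. However, as electronic texts grow in scale and the types and number of terms increase, tagging with MARKUS alone becomes too slow to keep up. This article introduces the DocuSky Batch Tagging Tool developed by Dr. Hsieh-Chang Tu. By presenting the tool's design philosophy and tagging principles, it explains how to use the tool to speed up the tagging of electronic texts and how its many functions yield more precise tagging results, in order to dispel humanities scholars' doubts about batch tagging. Taking the tagging of the Chinese Union Version of the Bible as an example, it shows how to organize, divide into documents, tag, and analyze a text with this kind of structure and content.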

Parallel Abstract


The markup of electronic texts is considered foundational to digital humanities research. The development of MARKUS at Leiden University means that humanities scholars without information technology skills can also mark up their research texts easily by themselves. With other digital humanities tools, they can then achieve resource indexing, term aggregation, and disambiguation. However, as the scale of the text and the number of term categories increase, it is not easy to complete the markup with MARKUS alone. In this article, we introduce the DocuSky Batch Tagging Tool developed by Dr. Hsieh-Chang Tu. By presenting the tool's development concept and marking principles, we explain how to use it to speed up the markup of texts and how its many functions achieve more accurate marking, in order to dispel humanities researchers' doubts about batch tagging. We use the Chinese Union Version of the Bible as an example to explain how to organize, divide, mark up, and analyze a text with this structure and content.

