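Obtaining reliable and abundant digitized research resources, and then processing and enriching them so that texts yield the greatest possible research value, is one of the strengths of digital humanities. Wikisource, whose motto is freedom and sharing, invites the public to upload and edit texts collaboratively; as a result it holds a large and varied collection of documents, and researchers can quickly find the material they need through its search function. Conventional document digitization, as on Wikisource, only converts paper into electronic form for reading. Digital humanities goes further and asks how software tools can enrich texts so that research gains new perspectives. To this end, the DocuSky Digital Humanities Research Platform (hereafter the DocuSky research platform) provides personal document databases: through its tools, users can establish their own way of managing documents and attach additional information to texts, such as metadata and tags on key terms. These features let users perform operations such as post-classification of search results, word-frequency statistics, and automatic term tagging, giving researchers different perspectives on their texts. This study focuses on developing Wiki2DocuXML, which combines the strengths of the two: the rich holdings of Wikisource and the powerful text-processing functions of the DocuSky research platform. Earlier tools have also addressed converting documents from other databases to the DocuSky research platform. Such interfacing tools concentrate on fast data access to very large databases and on converting the data into the document format DocuSky accepts, but they give users little control during the conversion. In view of this, this thesis introduces the concept of a Simple Workflow, which uses the Wikisource API for access and data filtering, and examines how good user interface design can let researchers not only retrieve the documents they need smoothly during interfacing but also make initial value-added use of those documents through the Simple Workflow, so that the DocuSky research platform can further support their research.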
One of the strengths of digital humanities is its ability to draw on reliable and rich digital research resources on the web, and to process and add value to them so that the texts can serve research purposes. Many web resources, such as Wikisource, only convert paper copies into full texts for human reading. Digital humanities further explores how to use tools to add value to texts, so that research can generate new perspectives. To this end, the DocuSky Digital Humanities Research Platform (hereafter the DocuSky research platform) offers a digital environment in which users can build their own document databases with the platform's tools and add a variety of information to texts, such as metadata and tagged words. These functions let users perform operations such as post-classification of query results, word frequency analysis, and automated glossary tagging, giving researchers different perspectives on their texts. This study focuses on how to combine the advantages of both: the rich literature of Wikisource and the powerful text processing capabilities of the DocuSky research platform. DocuSky currently provides tools for downloading texts from CTEXT, CBETA, and Kanripo, among others, but these tools offer little more than downloading the texts and converting them into DocuXML. In view of this, we introduce the concept of a Simple Workflow, which uses the Wikisource API for access and data filtering, and we explore how good user interface design can let researchers not only obtain the documents they need smoothly during the interfacing process, but also make initial value-added use of those documents.
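The retrieve, filter, and convert steps mentioned above can be sketched roughly as follows. This is a minimal illustration, not the Wiki2DocuXML implementation: the function names are hypothetical, the endpoint assumes the Chinese-language Wikisource, the filtering rules are deliberately crude, and the DocuXML element names (`ThdlPrototypeExport`, `documents`, `document`, `corpus`, `doc_content`) are an assumption about the format that should be checked against the DocuSky documentation.

```python
import json
import re
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the Chinese-language Wikisource endpoint of the MediaWiki Action API.
API_URL = "https://zh.wikisource.org/w/api.php"


def build_query_url(title: str) -> str:
    """Build a MediaWiki API URL requesting the latest wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_page(title: str) -> dict:
    """Download and decode the JSON response for one page (needs network access)."""
    request = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "wiki2docuxml-sketch/0.1"},  # hypothetical agent name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_wikitext(response: dict) -> str:
    """Pull the wikitext of the first page out of a formatversion=2 response."""
    page = response["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


def filter_wikitext(text: str) -> str:
    """Crude filtering: drop non-nested templates, unwrap links, strip quote markup."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)
    text = re.sub(r"'{2,}", "", text)
    return text.strip()


def to_docuxml(title: str, body: str, corpus: str = "Wikisource") -> str:
    """Wrap one document in a DocuXML-like envelope (element names are assumed)."""
    root = ET.Element("ThdlPrototypeExport")
    documents = ET.SubElement(root, "documents")
    document = ET.SubElement(documents, "document", filename=title)
    ET.SubElement(document, "corpus").text = corpus
    ET.SubElement(document, "doc_content").text = body
    return ET.tostring(root, encoding="unicode")
```

A call such as `to_docuxml(title, filter_wikitext(extract_wikitext(fetch_page(title))))` would chain the three steps for a single page. Keeping the download separate from the filtering and conversion means each stage can be inspected or adjusted on its own, which is the kind of user control the Simple Workflow is meant to provide.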