

DocuSky: A Platform for Constructing and Analyzing Personal Text Databases




The main, and often the only, purpose of most traditional digital libraries and archiving systems is to provide good content together with a retrieval system that helps users find the documents they want. This is often not sufficient for humanists who want to use digital tools to explore the properties of interesting subsets of a library or system. Humanists usually do not rely on such libraries or systems alone; they also keep texts of interest on their own hard disks. With conventional systems it can be hard to analyze the properties of texts stored on personal computers. To address this problem, it is desirable to have a platform that allows one to build personal databases that support not only common retrieval functions but also text-analytic ones. In this paper, we propose DocuSky as such a platform. DocuSky allows a user to upload text content to build a personal database. It supports fulltext retrieval, post-classification over a search result, and analysis of tagged terms. Fulltext retrieval is the common way to search a database for desired documents. For any search result, post-classification groups the documents by their metadata and shows the resulting distribution. Analysis of tagged terms, in turn, returns a list of the tagged terms occurring in that search result. These are the three major functions offered by the well-known Taiwan History Digital Library (THDL) system. In addition to these elementary functions, DocuSky provides a couple of tools to help users analyze the contents of a database. The advance of digital humanities requires close cooperation between computer engineers and digital humanists. DocuSky encourages tool developers and humanists to rethink the roles of texts and content-analytics tools. To reduce the effort of tool development, we design a set of DocuSky APIs and widgets that ease access to the content of a personal database.
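The three functions described in the abstract can be illustrated with a minimal Python sketch. It is not the DocuSky API: the `Document` class, the function names, and the sample records are all hypothetical, and the search is a plain substring match rather than whatever indexing DocuSky actually uses. The sketch only shows how post-classification and tagged-term analysis both operate on a search result rather than on the whole database.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    """A hypothetical record in a personal text database (not DocuSky's schema)."""
    title: str
    text: str
    metadata: dict = field(default_factory=dict)       # e.g. {"year": "1895"}
    tagged_terms: list = field(default_factory=list)   # (tag, term) pairs


def fulltext_search(docs, query):
    """Return the documents whose text contains the query string."""
    return [d for d in docs if query in d.text]


def post_classify(result, key):
    """Group a search result by one metadata field and count each value."""
    return Counter(d.metadata.get(key, "(unknown)") for d in result)


def tagged_term_counts(result):
    """Count every (tag, term) pair occurring in the search result."""
    return Counter(pair for d in result for pair in d.tagged_terms)


# Made-up sample data, for illustration only.
docs = [
    Document("A", "land contract signed in Tainan", {"year": "1895"}, [("place", "Tainan")]),
    Document("B", "land survey report for Tainan", {"year": "1895"}, [("place", "Tainan")]),
    Document("C", "tax record from Taipei", {"year": "1900"}, [("place", "Taipei")]),
    Document("D", "land contract signed in Taipei", {"year": "1900"}, [("place", "Taipei")]),
]

result = fulltext_search(docs, "land")     # documents A, B and D
by_year = post_classify(result, "year")    # distribution of the result over "year"
terms = tagged_term_counts(result)         # tagged terms within the result
```

In this toy data set, a search for "land" returns three documents; post-classification by year then shows two documents from 1895 and one from 1900, and the tagged-term analysis lists "Tainan" twice and "Taipei" once.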




