

Information Extraction for Academic Conferences and Its Application


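The Internet has become the main channel for disseminating academic information. This study focuses on extracting the academic conference information on the Internet that researchers care about, providing conference topics, times, locations, and related information, in the hope of reducing researchers' burden of collecting and managing conference information and thereby improving the efficiency of academic research publication. The study first proposes an automatic procedure for retrieving and extracting academic conference information and confirms its feasibility through experiments: document classification achieves an F1 measure above 80%, and named-entity extraction achieves Recall above 86% and an F1 measure above 70%. It then develops an academic conference retrieval and extraction system platform that provides document retrieval, information extraction, category browsing, and a calendar, integrating researchers' academic activities with their everyday scheduling and demonstrating the practicality of the proposed procedure.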


The Internet has become a major channel for disseminating academic information in recent years. Academic information such as "call for papers", "call for proposals", and "advances of research" is crucial for researchers, who must publish their research outputs and keep up with new research trends. This study focuses on extracting academic conference information, including topics, temporal information, and spatial information, with the aim of reducing the overhead researchers incur in searching for and managing conference information and of improving the efficiency of publishing research outputs. First, an automatic procedure for conference information retrieval and extraction is proposed, and a sequence of experiments confirms its feasibility: the F1 measure for text classification is over 80%, and for named-entity extraction Recall is over 86% and the F1 measure is over 70%. A system platform for academic conference information retrieval and extraction is then implemented to demonstrate the practicality of the procedure. The system provides document retrieval, named-entity extraction, faceted browsing, and a calendar that integrates researchers' academic activities with their daily schedules.
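
The Recall and F1 figures quoted in the abstract follow the standard definitions used in extraction evaluation. The sketch below shows how they relate; the function name and the counts are hypothetical illustrations, not data from the study.

```python
def precision_recall_f1(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall, F1) computed from raw entity counts."""
    predicted = true_positives + false_positives   # entities the extractor emitted
    actual = true_positives + false_negatives      # entities in the gold annotation
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (NOT from the paper): 87 correct extractions,
# 55 spurious ones, and 13 gold entities that were missed.
precision, recall, f1 = precision_recall_f1(87, 55, 13)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With counts like these, recall can be high while F1 stays noticeably lower, because F1 is pulled down by the weaker precision. This is one way a Recall above 86% can coexist with an F1 measure just above 70%.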


