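In recent years the Internet has become an important channel for exchanging information between people, and how to manage the knowledge contained in the vast amount of information on the network has become an issue that cannot be ignored. Traditional search engines are limited by the keywords the user enters and by the lack of knowledge classification, so the user has to look through the returned documents one by one to find the answer. If a search could directly parse the user's question, take the user's domain of interest into account, and add a suitable knowledge structure and a suitable answer extraction mechanism, the user would no longer be limited by the keywords entered and would not need to specify precisely the domain of documents to be searched. The research field of Q&A systems arose in response. USENET is a logical network connecting newsgroups all over the world. It provides users worldwide with a channel for exchange and discussion and contains a large number of semi-structured Q&A documents. This research parses the user's question, applies a synonym-based question expansion mechanism, classifies and searches documents with an ontology for knowledge classification in the IT domain, and adds an answer extraction mechanism to implement a Q&A system for USENET, which also provides different view functions that let the user inspect information related to the answer. The results of this research show that applying a Q&A system to USENET documents with simple natural language processing and a scoring function based mainly on word overlap is feasible; however, the knowledge resources used in this research are still insufficient if the system's performance is to be improved.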
In recent years, the web has become an important channel of communication between people. A traditional search engine accepts a list of keywords entered by the user and returns a ranked list of search results. This approach is limited by the keywords the user actually types and by the lack of knowledge taxonomy information, so the user still has to check the ranked list item by item to find the information he or she really wants. The key tasks of a Q&A (Question Answering) system are to extract information from a natural language question entered by the user, to find the correct category of texts according to a knowledge taxonomy mechanism, and finally to extract the answer from those texts. In this way, the user neither needs to pinpoint the category of the articles to be searched nor needs to know all the exact keywords related to the answer. USENET is a logical network composed of newsgroups all around the world, and it is rife with semi-structured Q&A messages. In this research, we investigated a Q&A system for IT-domain newsgroups using an ontology-based knowledge taxonomy technique. We used WordNet synonym sets to expand the query derived from the user's input and used the expanded query to find the newsgroup categories of interest. We then evaluated the messages in those newsgroups with a scoring mechanism and extracted a reference answer from the higher-scoring messages. In addition, we designed different view functions that let the user inspect more information related to the answer. Our results show that applying a Q&A system to USENET using shallow natural language processing and a word-overlap-based scoring function is feasible. However, improving performance further requires more knowledge resources than those used in this research.
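To make the idea of synonym-based query expansion followed by word-overlap scoring concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the system described in this research: the toy synonym table stands in for WordNet synonym sets, the stopword list is arbitrary, the score is a plain count of shared terms, and all function names (`tokenize`, `expand_query`, `overlap_score`, `rank_messages`) are hypothetical.

```python
import re

# Hypothetical toy synonym table standing in for WordNet synonym sets.
TOY_SYNONYMS = {
    "remove": {"delete", "uninstall"},
    "program": {"application", "software"},
}

# Small illustrative stopword list (an assumption, not taken from the research).
STOPWORDS = {"a", "an", "the", "how", "do", "i", "to", "can", "you",
             "in", "is", "of", "it", "from", "my"}


def tokenize(text):
    """Lowercase the text and return its non-stopword word tokens as a set."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def expand_query(question, synonyms=TOY_SYNONYMS):
    """Return the question's terms plus any synonyms listed for them."""
    terms = tokenize(question)
    expanded = set(terms)
    for term in terms:
        expanded |= synonyms.get(term, set())
    return expanded


def overlap_score(query_terms, message):
    """Count how many query terms also occur in the message."""
    return len(query_terms & tokenize(message))


def rank_messages(question, messages):
    """Score each message against the expanded question, best first."""
    query_terms = expand_query(question)
    scored = [(overlap_score(query_terms, m), m) for m in messages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

In this sketch, the question "How do I remove a program?" shares no content word with a message such as "You can uninstall the application from the control panel." until the query is expanded, which is the motivation for the expansion step. A real system would also restrict scoring to messages in the newsgroup categories selected through the ontology, which this sketch omits.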