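In today's era of information explosion on the Internet, the large volume of results returned by search engines often has to be filtered further before users find what they need. In the subsequent Web 2.0 era, community-driven question answering (CQA) websites emerged to help users obtain the information they need more quickly and conveniently. However, as CQA websites have grown rapidly, the quality of their information has come to vary sharply, from professional to careless, and users faced with so much content of uneven quality do not know how to choose. This study therefore identifies feature variables of the information content on community websites, analyzes that content with data mining techniques, and examines the relationships among the variables, with the aim of establishing a method for automatically predicting the best answer on CQA websites. In the experimental results, among the three categories of variables examined (user, answer, and question), answer variables combined with question and user variables had the most significant effect on best-answer prediction. A comparison of single classifiers and multi-classifiers shows that, once the missing-data factor is eliminated, multi-classifiers can effectively improve prediction performance. Of the four models built in this study, [answer + user + question] combined with classification and regression trees (Simple CART) gave the best prediction performance, with an accuracy of 73.98%.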
With the rapid growth of Internet technology and related applications, the volume of information available online has increased dramatically, and the situation we face has changed. As John Naisbitt said, “We are drowning in information but starved for knowledge.” The development of effective and efficient knowledge discovery techniques for retrieving useful knowledge from huge datasets has thus become an essential issue. The maturation of Web 2.0 applications has provided us with opportunities and tools to extend existing knowledge discovery techniques. Community-driven question answering (CQA) websites are a typical example: they help users obtain useful information (i.e., an answer to a specific question) more rapidly and conveniently. However, the quality of answers varies widely, from professional to dilettante, which raises the further problem of determining the best answer among the candidates. Therefore, this study focuses on establishing an automatic method for predicting the best answers in the CQA environment. We extract a comprehensive set of features and classify them into three categories, namely answer-related (A), question-related (Q), and user-related (U) variables. On the basis of these three types of features, we propose four prediction models, i.e., A, AQ, AU, and AQU. Moreover, several state-of-the-art single-classifier and multi-classifier induction techniques are examined to evaluate their performance on best-answer prediction in CQA websites. A dataset collected from Yahoo! Knowledge+ is employed for the evaluation. In our empirical evaluation, the AQU model combined with classification and regression trees (Simple CART) achieves the best prediction performance, with an accuracy of 73.98%.
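As a rough illustration of how the four prediction models relate to the three feature categories, the following Python sketch assembles the feature set for each of A, AQ, AU, and AQU and computes a plain accuracy score. The feature names, the `FEATURES` table, and the helper functions are hypothetical placeholders chosen for illustration; the abstract does not list the actual variables or describe the implementation used in the study.

```python
from itertools import chain

# Hypothetical feature names per category (A = answer, Q = question, U = user).
# These are assumptions for illustration, not the study's actual variables.
FEATURES = {
    "A": ["answer_length", "answer_position"],
    "Q": ["question_length", "num_answers"],
    "U": ["answerer_points", "answerer_best_answer_ratio"],
}

# The four prediction models named in the abstract.
MODELS = ["A", "AQ", "AU", "AQU"]


def feature_set(model):
    """Return the feature names used by a model label such as 'AQ'."""
    return list(chain.from_iterable(FEATURES[category] for category in model))


def project(record, model):
    """Keep only the values of `record` that belong to the given model."""
    return [record[name] for name in feature_set(model)]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true best-answer labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


if __name__ == "__main__":
    for model in MODELS:
        print(model, feature_set(model))
```

Each projected feature vector could then be passed to whichever classifier is under evaluation, so that the same classifier is compared across the four feature combinations.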