Document Recommendation Using Data Mining Techniques: Designing an FAQ Recommender System as a Case Study

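Frequently Asked Questions (FAQ) lists serve users by providing answers to the questions most often asked in a particular domain or on a particular topic. This saves users the time they would spend waiting for a reply and reduces the time service staff spend answering repetitive questions. Recently, to improve the efficiency of customer service centers (call centers), industry has increasingly adopted FAQ systems to raise the quality of customer service and lighten the load on service staff. In addition, an FAQ is a collection of many questions and answers and holds a large amount of implicit knowledge. It can increase opportunities for exchange and learning in online teaching, and it can be applied within an enterprise to improve production and the training of new staff. In the past, data query problems were usually addressed with Information Retrieval (IR) techniques, which improve search accuracy and can even provide a natural-language query environment. This study steps outside the IR perspective: it records and analyzes the behavior of FAQ visitors as they look for information. Analyzing FAQ data through users' browsing experience greatly reduces the up-front cost of classifying the FAQ and, with a simple mechanism design, avoids the large and complex semantic database planning that IR requires. This study therefore uses data mining techniques to analyze the browsing experience of FAQ users and generalizes it into recommendations of relevant information for future visitors. Data mining can find useful, implicit information in large databases. Its basic premise, however, is to propose rules only once enough experience has accumulated, which does not always suit a fast-changing environment such as the Internet. This study proposes an algorithm that combines the characteristics of association rules and data clustering and performs document clustering quickly and efficiently according to the strength of association between questions. Combined with the mechanisms of recommender systems, it forms the basis of an FAQ recommender system. When users use this system, it effectively recommends relevant documents so that they can quickly find the answers they want. The document clustering algorithm proposed here was developed for fast-responding, changeable environments, so it suits FAQ question-and-answer search and could in future be extended to other areas of the Internet to support searching for all kinds of documents and information.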

Keywords

Document Clustering; Data Mining; Frequently Asked Questions; Data Clustering; Association Rules; Web Mining

Parallel Abstract

A Frequently Asked Questions (FAQ) list is a useful mechanism for providing answers, saving the time otherwise spent answering the same questions repeatedly. Recently, many enterprises have adopted FAQ systems to improve the efficiency of their call centers. In addition, because an FAQ contains many question-answer pairs, it embeds a great deal of knowledge, so an FAQ system can also serve as a knowledge management tool for training employees. In the past, Information Retrieval (IR) techniques were often used to solve the information access problem. This research does not use IR techniques; instead, we record and analyze users' browsing behavior in the FAQ system to find valuable information. By analyzing this browsing behavior, we obtain the similarities among the items in the FAQ system and cluster the items accordingly. When a new user asks a question, the system recommends the relevant items that belong to the same cluster as the question. We propose an algorithm that combines association rule and data clustering methods. The algorithm evaluates the similarities among the items in an FAQ system and, based on those similarities, performs document clustering efficiently. Based on this algorithm, we design an FAQ recommender system that recommends the most relevant documents to its users.
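The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: measure association strength between FAQ items from browsing sessions, group strongly associated items, and recommend the other items in the same group. Everything specific here is an assumption made for illustration, not the thesis's method: the function names, the use of the larger of the two rule confidences as "strength", the `min_strength` threshold, the connected-components grouping, and the toy session log.

```python
from collections import Counter
from itertools import combinations


def pair_strengths(sessions):
    """Return {(a, b): strength}, where strength is the larger of the two
    association-rule confidences conf(a -> b) and conf(b -> a).
    (Assumed measure; the thesis may define strength differently.)"""
    item_count = Counter()
    pair_count = Counter()
    for session in sessions:
        items = sorted(set(session))  # ignore repeat views within a session
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    return {
        (a, b): max(n / item_count[a], n / item_count[b])
        for (a, b), n in pair_count.items()
    }


def cluster_items(sessions, min_strength=0.6):
    """Group items into connected components over the pairs whose strength
    reaches min_strength (a hypothetical threshold)."""
    parent = {item: item for session in sessions for item in session}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), strength in pair_strengths(sessions).items():
        if strength >= min_strength:
            parent[find(a)] = find(b)

    clusters = {}
    for item in parent:
        clusters.setdefault(find(item), set()).add(item)
    return list(clusters.values())


def recommend(item, sessions, min_strength=0.6):
    """Return the other items in the same cluster as `item`, sorted by id."""
    for cluster in cluster_items(sessions, min_strength):
        if item in cluster:
            return sorted(cluster - {item})
    return []


# Toy browsing log (invented): each inner list is one visitor's viewed FAQ ids.
sessions = [
    ["q1", "q2"],
    ["q1", "q2", "q3"],
    ["q2", "q1"],
    ["q4", "q5"],
    ["q4", "q5"],
    ["q3", "q4"],
]
print(recommend("q1", sessions))  # → ['q2']
```

In this toy log, `q1` and `q2` always appear together, so they form one cluster, while `q3` is only weakly tied to the others and stays on its own. Because the counts are simple tallies, they could be updated as new sessions arrive, which is one plausible way to fit the fast-changing setting the abstract mentions.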

Parallel Keywords

Data Clustering; Data Mining; Web Mining; Document Clustering; Frequently Asked Questions; Association Rule

Cited By

吳佩紋 (2011). A Study of an Automatic FAQ Classification System and User Satisfaction [Master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2011.00823

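谷佳臻 (2007). Applying Computer-Assisted Analysis Software to the Content Analysis of Interview Transcripts in Qualitative Research [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-2910200810572104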
