摘要 在網站大量成長的情形下,網路上的資料量也急劇的成長。要如何的幫助使用者更快速的找尋到所需要的資料或是將這麼大量的資料以方便使用者閱讀的方式呈現給使用者,已經成為一個重要的問題。 本研究利用網頁探勘的技術,提出一種簡單可行的使用者導向之文件分群方法,只需要網站日誌檔中有關於使用者使用關鍵字與瀏覽紀錄即可達到文件分群的功能。以這樣的方式來進行資料的群集可以節省維護詞庫檔與處理文字段詞、統計詞類等等的人力與時間。而且透過研究提出之方法所產生的文件群集可以更直接的反應使用者的興趣與偏好。此外,以研究所提出的分群方法分群後的群集可以有一些使用者所使用的關鍵字詞來描述所分群產生的結果。透過分析觀察也發現,這樣的分群方法所產生之群集有一定程度的可描述性(正確性)。
Abstract As the number of websites grows rapidly, the amount of data on the Web grows dramatically as well. Helping users find the information they need more quickly, and presenting such large volumes of data in a form that is easy to read, has become an important problem. This study applies web mining techniques to propose a simple and practical user-oriented document clustering method. The method requires only the users' search keywords and browsing records stored in a website's log files. Clustering in this way saves the manpower and time otherwise spent on maintaining lexicon files, segmenting text into words, and compiling part-of-speech statistics. The resulting document clusters also reflect users' interests and preferences more directly. In addition, each cluster produced by the method can be described by keywords that the users themselves entered. Analysis and observation further show that the clusters produced in this way have a certain degree of describability (accuracy).
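The idea summarized above can be illustrated with a minimal sketch. It is not the procedure used in this study, whose details are not given in the abstract. It assumes that the log has already been reduced to (keyword, document) pairs, that documents are compared by the Jaccard similarity of their keyword sets, and that documents whose similarity reaches a threshold are merged into one cluster. The names LOG, keyword_profiles, cluster_documents, and the threshold of 0.5 are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (search keyword, document the user then viewed).
LOG = [
    ("python", "doc1"), ("python", "doc2"),
    ("pandas", "doc1"), ("pandas", "doc2"),
    ("hiking", "doc3"), ("hiking", "doc4"),
    ("trails", "doc3"), ("trails", "doc4"),
    ("python", "doc5"),
]


def keyword_profiles(log):
    """Count, for each document, the keywords that led users to it."""
    profiles = defaultdict(Counter)
    for keyword, doc in log:
        profiles[doc][keyword] += 1
    return profiles


def jaccard(a, b):
    """Jaccard similarity of two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_documents(log, threshold=0.5):
    """Group documents whose keyword sets overlap (assumed rule: single-link
    merging at a Jaccard threshold) and label each group with its two most
    frequent keywords."""
    profiles = keyword_profiles(log)
    docs = sorted(profiles)
    parent = {d: d for d in docs}  # union-find forest over documents

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path compression
            d = parent[d]
        return d

    for i, a in enumerate(docs):
        for b in docs[i + 1:]:
            if jaccard(set(profiles[a]), set(profiles[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for d in docs:
        groups[find(d)].append(d)

    clusters = []
    for members in groups.values():
        counts = Counter()
        for d in members:
            counts.update(profiles[d])
        # Order by descending frequency, then alphabetically for stable ties.
        labels = sorted(counts, key=lambda k: (-counts[k], k))[:2]
        clusters.append((sorted(members), labels))
    return sorted(clusters)


if __name__ == "__main__":
    for members, labels in cluster_documents(LOG):
        print(members, "described by", labels)
```

Because the cluster labels are taken from the users' own queries, no lexicon, word segmentation, or part-of-speech analysis of the document text is needed in this sketch, which is the saving the abstract refers to. A real log would first have to be parsed and divided into sessions to obtain the keyword and document pairs.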