This study proposes a hierarchical clustering method that dynamically clusters web search results, so that users can browse the resulting cluster tree and quickly locate the web pages they are interested in. The method extracts feature terms from the titles and snippets of the search results and then uses a combined indicator, computed from each term's page coverage and distinctiveness, to select the cluster concepts, the cluster labels, and the number of clusters. The method allows a web page to be assigned to several clusters, and it tries to place pages that originally ranked higher into the leading clusters. Using an implemented system, this study selected the clustering termination condition from preliminary reach time results for popular Chinese and English search keywords. Performance was then compared through a user satisfaction test and through reach time measurements on Chinese and English keywords. In the user study, users were more satisfied with the proposed system than with the commercial clustering system Vivisimo, and slightly more satisfied with it than with the related hierarchical clustering method DisCover. The reach time measurements confirmed that the proposed system clearly outperforms Vivisimo and performs slightly better than DisCover.
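The selection step described above can be illustrated with a minimal Python sketch of one level of the cluster tree. The abstract does not give the scoring formulas, so everything specific here is an assumption made for illustration: the function names `tokens` and `cluster_results`, the stopword list, the definition of distinctiveness as the share of a term's pages not yet covered by earlier clusters, the linear `weight` that combines the two scores, and the stopping rule. It handles whitespace-delimited English text only and is not the system evaluated in this study.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "and", "for", "in", "to", "is",
             "from", "you", "near"}


def tokens(text):
    """Return the set of lowercased alphabetic tokens, minus stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS}


def cluster_results(results, query, max_clusters=3, weight=0.5):
    """Greedily pick cluster labels for one level of the tree.

    results: list of (title, snippet) pairs in original rank order.
    Returns a list of (label, [page ranks]); a page may appear in
    more than one cluster.
    """
    query_terms = tokens(query)
    postings = {}  # term -> set of page ranks containing the term
    for rank, (title, snippet) in enumerate(results):
        for term in tokens(title + " " + snippet) - query_terms:
            postings.setdefault(term, set()).add(rank)

    n = len(results)
    covered = set()
    clusters = []
    while len(clusters) < max_clusters and postings:
        def score(term):
            pages = postings[term]
            coverage = len(pages) / n
            # Assumed definition: share of this term's pages that no
            # previously chosen cluster already contains.
            distinctiveness = len(pages - covered) / len(pages)
            return weight * coverage + (1 - weight) * distinctiveness

        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(postings), key=score)
        pages = postings.pop(best)
        if not pages - covered:
            # Simplistic stand-in for the study's termination condition:
            # stop when the best candidate adds no uncovered page.
            break
        clusters.append((best, sorted(pages)))
        covered |= pages

    # Put clusters that contain the highest-ranked pages first.
    clusters.sort(key=lambda c: c[1][0])
    return clusters


results = [
    ("Jaguar cars official site", "Luxury cars and sports cars from Jaguar"),
    ("Jaguar animal facts", "The jaguar is a big cat living in the rainforest"),
    ("Jaguar cars dealers", "Find used Jaguar cars near you"),
    ("Jaguar cat habitat", "Rainforest habitat of the jaguar cat"),
]

for label, pages in cluster_results(results, "jaguar"):
    print(label, pages)
```

Applying the same function recursively to the pages of each cluster would produce deeper levels of the tree. Overlap arises naturally in this sketch because each term keeps its full page set, so a page that contains two selected terms lands in both clusters.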