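This study investigates the factors that Chinese search engines use when ranking web pages, the relative weights of those factors, and how query keywords and related terms should be placed in a page's title and description. We extract the title, description, and URL of each page from the search results of Chinese Yahoo and Chinese Google, use latent semantic analysis to find terms in the titles and descriptions that are related to the query keyword and to compute their weights, and finally rescore and re-rank the pages with a linear combination of four ranking factors (title, description, URL, and page quality score) in order to compare the new ranking with the original one. We ran experiments with 20 query keywords on the search results of Chinese Google and Chinese Yahoo. The results show that Google emphasizes where the query keyword appears in the page title, whereas Yahoo does not emphasize position and only requires that the query keyword appear. As for factor weights, PageRank receives a higher weight than the other factors for both search engines. The experimental results indicate that the proposed method is more stable on Google search results but, overall, performs better on Yahoo.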
This study approximates the ranking functions of Chinese search engines using a linear combination of weighted scores for the title, snippet, URL, and PageRank of web pages. The effects of the query's location and of the number of semantically related terms in the title and snippet are also examined. The top 20 search results were retrieved from Google Taiwan and Yahoo Taiwan as the data set. Latent Semantic Analysis was employed to compute relevance scores for terms semantically related to a given query, and each retrieved web page was then assigned a new score and a new rank for ranking evaluation. The experimental results show that the query’s position in the title is important to Google, whereas Yahoo does not seem to consider the query’s position. The results also indicate that the proposed method is more stable on Google search results, while it performs better on Yahoo search results.
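The re-ranking step described above can be illustrated with a minimal sketch. It assumes each page already has per-factor scores normalized to [0, 1]. The weight values, the field names, and the mean absolute rank shift used to compare old and new ranks are all hypothetical choices made for illustration; they are not the weights or the evaluation measure reported by the study. The only property borrowed from the abstract is that PageRank is weighted more heavily than the other factors.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    original_rank: int  # rank reported by the search engine (1 = top)
    title: float        # hypothetical title score in [0, 1]
    snippet: float      # hypothetical snippet (description) score in [0, 1]
    url_score: float    # hypothetical URL score in [0, 1]
    pagerank: float     # hypothetical normalized page quality score in [0, 1]


# Hypothetical weights (assumption): PageRank weighted highest.
WEIGHTS = {"title": 0.25, "snippet": 0.20, "url_score": 0.15, "pagerank": 0.40}


def combined_score(page, weights=WEIGHTS):
    """Linear combination of the four factor scores."""
    return sum(w * getattr(page, name) for name, w in weights.items())


def rerank(pages, weights=WEIGHTS):
    """Return a mapping url -> new rank (1 = highest combined score)."""
    ordered = sorted(pages, key=lambda p: combined_score(p, weights), reverse=True)
    return {p.url: rank for rank, p in enumerate(ordered, start=1)}


def mean_rank_shift(pages, new_ranks):
    """Average absolute difference between original and new ranks."""
    return sum(abs(p.original_rank - new_ranks[p.url]) for p in pages) / len(pages)


# Made-up sample data, for illustration only.
pages = [
    Page("a", 1, title=0.9, snippet=0.5, url_score=0.2, pagerank=0.8),
    Page("b", 2, title=0.4, snippet=0.6, url_score=0.1, pagerank=0.3),
    Page("c", 3, title=0.7, snippet=0.7, url_score=0.5, pagerank=0.6),
]
new_ranks = rerank(pages)
print(new_ranks, mean_rank_shift(pages, new_ranks))
```

A smaller mean rank shift means the linear combination reproduces the engine's ordering more closely, so under this assumed measure the weights could be tuned per search engine to minimize the shift.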