As the Internet has become ubiquitous, looking up information through search engines has become commonplace. However, when users query a term, they often encounter many similar variants and cannot tell which one is correct. If the terms that occur most frequently, or that are most similar to the query, can be extracted from the large volume of results a search engine returns, users may be able to quickly identify the variant that is most widely used and most likely to be correct. In addition, some similar terms arise from rearranging the characters of a term. This thesis therefore implements a system that finds Chinese terms related to a user's query in real time, computes statistics over them, and ranks them primarily by frequency or similarity. If the query is a line from the Three Hundred Tang Poems or a Chinese idiom, the system also displays its source. Finally, the system presents the results in the order specified by the user.
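The ranking step can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the abstract does not say how candidate terms are extracted or how similarity is measured, so the sketch assumes the candidates have already been pulled from search results into a list (`occurrences`, sample data invented for illustration) and uses `difflib.SequenceMatcher` as a stand-in similarity measure. The names `RankedTerm` and `rank_terms` are hypothetical, and the source lookup for Tang poems and idioms is not covered.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import NamedTuple


class RankedTerm(NamedTuple):
    term: str
    frequency: int        # how many times the term appeared in the results
    similarity: float     # 0.0 to 1.0, relative to the query (assumed measure)
    is_reordering: bool   # same characters as the query, different order


def rank_terms(query, occurrences, sort_by="frequency"):
    """Count candidate terms and sort them by frequency or by similarity."""
    if sort_by not in ("frequency", "similarity"):
        raise ValueError("sort_by must be 'frequency' or 'similarity'")

    counts = Counter(occurrences)
    rows = [
        RankedTerm(
            term,
            freq,
            # Assumption: SequenceMatcher.ratio() stands in for the
            # similarity measure, which the abstract does not specify.
            SequenceMatcher(None, query, term).ratio(),
            term != query and sorted(term) == sorted(query),
        )
        for term, freq in counts.items()
    ]

    # The chosen criterion is the primary key; the other one breaks ties.
    if sort_by == "frequency":
        return sorted(rows, key=lambda r: (-r.frequency, -r.similarity))
    return sorted(rows, key=lambda r: (-r.similarity, -r.frequency))


# Invented sample data: candidate terms as they might appear in search results.
query = "山明水秀"
occurrences = ["水秀山明"] * 4 + ["山明水秀"] * 3 + ["山清水秀"] * 2

by_frequency = rank_terms(query, occurrences, sort_by="frequency")
by_similarity = rank_terms(query, occurrences, sort_by="similarity")

for row in by_frequency:
    print(row)
```

With this sample data the two orderings differ: sorting by frequency puts the rearranged form 水秀山明 first, while sorting by similarity puts the exact query first. That difference is why letting the user choose the sort criterion is useful.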