Parallel Abstract


Clustering large text corpora plays a key role in navigating and browsing massive amounts of text data, particularly in search engines. Conventional clustering techniques compare documents by the surface similarity of words or extracted morphemes, which usually yields clusters that are not semantically coherent. This paper considers Farsi, also known as Persian, because the amount of electronic Farsi text is growing rapidly. The documents are enriched with semantic relationships (synonymy, hypernymy, and hyponymy) extracted from the FarsNet lexical ontology. A word sense disambiguation (WSD) procedure is proposed to decrease uncertainty. After preprocessing, three clustering algorithms (Bisecting K-means, LSI-based clustering, and PLSI-based clustering) are applied to the pre-categorized Persian Hamshahri corpus. Experimental results show that clustering quality improves when the text data is enriched with semantic relations, especially with the PLSI-based approach.
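
To illustrate the enrichment idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-made English lexicon (`TOY_LEXICON`, hypothetical) in place of real FarsNet lookups, skips the WSD step entirely, and uses plain bag-of-words cosine similarity rather than any of the three clustering algorithms. The helper names `enrich` and `cosine` are likewise illustrative.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon standing in for FarsNet lookups (not real FarsNet data).
TOY_LEXICON = {
    "car": {"synonyms": ["automobile"], "hypernyms": ["vehicle"]},
    "automobile": {"synonyms": ["car"], "hypernyms": ["vehicle"]},
    "truck": {"synonyms": [], "hypernyms": ["vehicle"]},
}


def enrich(tokens, lexicon=TOY_LEXICON):
    """Return the tokens plus the synonyms and hypernyms of each token."""
    enriched = list(tokens)
    for tok in tokens:
        entry = lexicon.get(tok, {})
        enriched.extend(entry.get("synonyms", []))
        enriched.extend(entry.get("hypernyms", []))
    return enriched


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = sqrt(sum(v * v for v in va.values()))
    norm_b = sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc_a = ["car", "engine"]
doc_b = ["truck", "engine"]

plain_sim = cosine(doc_a, doc_b)
enriched_sim = cosine(enrich(doc_a), enrich(doc_b))
print(plain_sim, enriched_sim)
```

In this toy case the two documents share only one surface word, but after enrichment they also share the hypernym "vehicle", so their similarity rises. A clustering algorithm operating on the enriched vectors would therefore be more likely to group them together.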
