
混合式粒子群最佳化與遺傳演算法於動態文件分群

Hybrid Particle Swarm Optimization and Genetic Algorithm for Dynamic Clustering Applied to Documents

Advisor: 楊燕珠

Abstract


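The rapid growth of the Internet has led to information overflow, so more effective methods are needed to organize data into useful information and, further, into knowledge. Cluster analysis, which groups similar data together, is one way to address this problem. New clustering techniques continue to be proposed, but most of them require the number of clusters to be defined in advance, and in practical applications an appropriate number is hard to determine beforehand. This study therefore uses Dynamic Clustering using Particle Swarm Optimization (DCPSO) as the basis for finding the most appropriate number of clusters, and then applies the floating-point reproduction, crossover, and mutation operations of Genetic Clustering for Unknown K (GCUK) to change particle positions, retaining good particles and eliminating poor ones. This remedies the shortcoming of DCPSO that its cluster centers are not evolved in the search for the best solution. The resulting hybrid dynamic clustering algorithm, Hybrid Particle Swarm Optimization and Genetic Algorithm for Dynamic Clustering (HPSOGADC), automatically determines an appropriate number of clusters and obtains good clustering quality. Experiments show that, compared with DCPSO or GCUK used alone, the proposed hybrid algorithm yields smaller differences between the data and the cluster centers, indicating better clustering quality, and the number of clusters it determines is closer to the correct answer. Applied to the clustering of document collections with high-dimensional attributes, this model can more efficiently gather documents on similar topics into clusters, helping users quickly retrieve necessary and useful information.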
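The abstract judges clustering quality by how far the data lie from their cluster centers. The sketch below illustrates one plausible reading of that idea: the mean Euclidean distance from each point to its nearest center. The function names and the toy data are assumptions made for illustration; the abstract does not state the exact measure used in the thesis.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def mean_distance_to_centers(points, centers):
    """Average distance from each point to its nearest center; smaller means tighter clusters."""
    total = sum(math.dist(p, centers[nearest_center(p, centers)]) for p in points)
    return total / len(points)

# Toy data (hypothetical): two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_centers = [(0.0, 1.0), (10.0, 1.0)]   # one center inside each group
poor_centers = [(5.0, 1.0), (5.0, 1.0)]    # both centers between the groups

good_score = mean_distance_to_centers(points, good_centers)
poor_score = mean_distance_to_centers(points, poor_centers)
```

Under this assumed measure, the centers placed inside the two groups score lower (better) than the centers placed between them.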

Parallel Abstract


The rapid development of the Internet has resulted in information overflow, so we need a more effective way to manage information, turning raw data into useful information and further into knowledge. Cluster analysis, which groups similar data together, is one of the solutions to this problem. New clustering techniques have constantly been proposed, but most of them must define the number of clusters in advance, which is very difficult in practical applications. Therefore, this research proposes a hybrid algorithm based on Dynamic Clustering using Particle Swarm Optimization (DCPSO) [9] to get the particles with the best cluster numbers, and then integrates Genetic Clustering for Unknown K (GCUK) [10], using its floating-point reproduction, crossover, and mutation operations to evolve the positions of the cluster centers, with survival of the fitter particles and elimination of the worse ones, thereby avoiding being limited to a local optimum and moving toward a global one. According to the experimental results, the proposed dynamic clustering algorithm, Hybrid Particle Swarm Optimization and Genetic Algorithm for Dynamic Clustering (HPSOGADC), can automatically determine the appropriate number of clusters and achieves better clustering quality than GCUK alone or DCPSO alone: the differences between the data and the cluster centers are smaller, and the number of clusters determined is closer to the correct answer. When the proposed algorithm is applied to the high-dimensional document clustering problem, texts with similar themes are gathered together more effectively than with other document clustering methods, which helps users quickly retrieve necessary and useful information.
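The evolutionary step described in the abstract (floating-point crossover and mutation of cluster-center positions, keeping the fitter particles and discarding the worse) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: the arithmetic crossover, the Gaussian mutation, the parameters `rate`, `sigma`, and `keep`, and the toy cost function are all hypothetical choices, and each particle is simplified to a flat list of center coordinates with a fixed length.

```python
import random

def crossover(a, b, rng):
    """Arithmetic (blend) crossover of two equal-length float vectors."""
    w = rng.random()
    return [w * x + (1 - w) * y for x, y in zip(a, b)]

def mutate(vec, rng, rate=0.1, sigma=0.5):
    """Add Gaussian noise to each coordinate with probability `rate`."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in vec]

def next_generation(particles, cost, rng, keep=2):
    """Keep the `keep` lowest-cost particles and refill the rest with mutated offspring."""
    ranked = sorted(particles, key=cost)
    survivors = ranked[:keep]
    parents = ranked[: max(2, len(ranked) // 2)]  # breed only from the better half
    children = []
    while len(survivors) + len(children) < len(particles):
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b, rng), rng))
    return survivors + children

# Toy run (hypothetical cost): minimise the sum of squared coordinates.
rng = random.Random(0)
cost = lambda v: sum(x * x for x in v)
population = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(6)]
best_before = min(map(cost, population))
for _ in range(20):
    population = next_generation(population, cost, rng)
best_after = min(map(cost, population))
```

Because the best particles are carried over unchanged, the best cost in the population can never get worse from one generation to the next. In a clustering setting the cost would be a cluster-quality measure rather than the toy function used here.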

Parallel Keywords

DCPSO; Dynamic Clustering; Document Clustering; GCUK

References


[1] James Kennedy and Russell Eberhart, "Particle Swarm Optimization," IEEE International Conference on Neural Networks, vol. 4, 1995, pp. 1942-1948.
[2] James Kennedy and Russell Eberhart, "A New Optimizer Using Particle Swarm Theory," IEEE Sixth International Symposium on Micro Machine and Human Science, 1995, pp. 39-43.
[4] Yuhui Shi and Russell Eberhart, "Empirical Study of Particle Swarm Optimization," The 1999 IEEE Congress on Evolutionary Computation, USA, pp. 1945-1950.
[5] John H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, Michigan, 1975.
[13] 楊燕珠 and 陳志豐, "Document Clustering Based on Frequent Itemset Integrated with Approximate Pattern Matching," 資訊管理學報 (Journal of Information Management), vol. 16, special issue, Jan. 2009, pp. 165-184.
