Applications of the Similarity-Based Clustering Method

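In this thesis, we apply the Similarity-Based Clustering Method (SCM) to microarrays, also called gene microarrays. We find that SCM can filter out gene vectors that carry little clustering information: it removes the gene vectors that are uninformative for grouping the arrays, while the important information in the original data is retained in the filtered data. We also find that, once the data have been self-organized by SCM (using the best Gamma value), the color data-distribution map produced by Michael B. Eisen's Cluster and Tree-View software lets researchers read off the likely number of clusters directly. Cluster on its own provides neither the filtering of uninformative gene vectors nor a data-distribution map from which the likely number of clusters can be clearly judged. We therefore suggest applying SCM to microarray data before analysis and then using Cluster and Tree-View, which gives researchers more information. We also implemented SCM as a Matlab subprogram to make it easier to use, and used it to compare two cases for the same data set: the dendrogram produced without SCM processing and the dendrogram produced after SCM (best Gamma value). The two differ markedly, and the dendrogram produced after SCM (best Gamma value) lets users clearly determine the number of clusters.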

Keywords

Lymphoma; Leukemia; Hierarchical Cluster Analysis; Similarity-Based Clustering Method (SCM); Microarray

Parallel Abstract

In this thesis, the Similarity-Based Clustering Method (SCM) is applied to microarray data. The results show that SCM can discard gene vectors that carry no clustering information while keeping the important information in the remaining data. In addition, we combine SCM with two software packages, Michael B. Eisen’s Cluster and Tree-View, to produce a color data-distribution graph from which researchers can more easily observe the possible clusters. We therefore suggest using SCM together with Cluster and Tree-View for better analysis of microarray data. We also developed a Matlab subprogram to make SCM easier to use, and used it to compare two conditions for the same source data: the tree graph produced without SCM and the tree graph produced with SCM (the best Gamma value). The two differ significantly, and the tree graph produced with SCM (the best Gamma value) lets users identify the clusters more clearly and easily.
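The self-organization step mentioned in the abstract can be illustrated with a small sketch. It is not the Matlab subprogram described here; it assumes the general form of SCM in reference [1], where each data point is moved by a fixed-point iteration toward a peak of a total-similarity function with similarity exp(-||x - z||^2 / beta) raised to the power Gamma. The function names, the choice of beta as the mean squared distance to the sample mean, the fixed Gamma value, and the toy data are all assumptions made for illustration. Selecting the best Gamma value is not shown.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def scm_self_organize(points, gamma, tol=1e-6, max_iter=200):
    """Move every point toward a nearby peak of the total similarity.

    Assumed similarity: exp(-||x - z||^2 / beta) ** gamma, with beta set to
    the mean squared distance of the data to the sample mean.
    """
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    beta = sum(sq_dist(p, mean) for p in points) / n
    if beta == 0:
        raise ValueError("all points are identical; nothing to cluster")
    centers = [list(p) for p in points]  # every point starts as its own centre
    for _ in range(max_iter):
        largest_move = 0.0
        updated = []
        for z in centers:
            # Similarity of every data point to the current centre z.
            w = [math.exp(-gamma * sq_dist(x, z) / beta) for x in points]
            total = sum(w)
            # Fixed-point update: similarity-weighted mean of the data.
            new_z = [sum(wj * x[d] for wj, x in zip(w, points)) / total
                     for d in range(dim)]
            largest_move = max(largest_move, math.sqrt(sq_dist(new_z, z)))
            updated.append(new_z)
        centers = updated
        if largest_move < tol:
            break
    return centers

def count_peaks(centers, radius):
    """Greedily merge centres closer than `radius`; return one per group."""
    peaks = []
    for z in centers:
        if all(sq_dist(z, p) > radius ** 2 for p in peaks):
            peaks.append(z)
    return peaks

# Hypothetical toy data: two well-separated groups of 2-D vectors.
data = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
        [5.0, 5.0], [5.1, 4.9], [4.9, 5.1]]
centers = scm_self_organize(data, gamma=5.0)
peaks = count_peaks(centers, radius=0.5)
print(len(peaks), "candidate clusters")
```

After self-organization the points in each group collapse onto nearly the same location, which is why a dendrogram or color map built from the self-organized data tends to show the cluster count more clearly than one built from the raw data.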

Parallel Keywords

Leukemia; Lymphoma; Microarray; Similarity-Based Robust Clustering Method (SCM); Hierarchical Cluster Analysis

References

[1] M. S. Yang and K. L. Wu, “A Similarity-Based Robust Clustering Method,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 4, pp. 434-448, 2004.

[2] L. A. Zadeh, “Similarity relations and fuzzy orderings,” Information Sciences, vol. 3, pp. 177-200, 1971.

[6] M. B. Eisen, P. T. Spellman, P. O. Brown and D. Botstein, “Cluster analysis and display of genome-wide expression patterns,” Proc. Natl. Acad. Sci. USA, vol. 95, pp. 14863-14868, 1998.

[8] J. P. Brunet, P. Tamayo, T. R. Golub and J. P. Mesirov, “Metagenes and molecular pattern discovery using matrix factorization,” Proc. Natl. Acad. Sci. USA, vol. 101, no. 12, pp. 4164-4169, 2004.

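[3] 陳健尉, “A powerful tool for gene analysis in the twenty-first century: an introduction to gene microarrays and their applications,” 生物醫學報導, no. 2, pp. 18-25, 2000 (in Chinese).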

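Cited by

郭怡君 (2010). 以相似度為基礎的大學分類研究 [A similarity-based study of university classification] [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu201000606

郭竹晏 (2009). 國小學童在數學直覺法則表現之相關與分群探討 [A study of correlation and clustering in elementary school students' performance on intuitive rules in mathematics] [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu200900668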
