Data Visualization by Self-Organizing Map

Original Chinese title: 以SOM做資料視覺化

Advisor: 陳朝欽

Abstract


Data visualization has become essential because we have already acquired huge, complex data sets and, thanks to cheap storage devices, continue to accumulate more. Most of these data are high-dimensional and therefore hard for humans to visualize. The Self-Organizing Map (SOM) emerged from research efforts to alleviate this problem. The SOM is an unsupervised neural network algorithm that projects high-dimensional data onto a two-dimensional map that humans can easily inspect. The projection preserves the topology of the data, so that similar data items are mapped to nearby locations on the map. It is a powerful method for data mining and cluster extraction, and it is very useful for processing data of high dimensionality and complexity. Several visualization methods present different aspects of the information learned by the SOM, helping to gain insight into the data and to guide its segmentation. In this thesis, common visualization methods (dendrogram, 2d-dendrogram, principal component projection, map labeling, and U-matrix) and some recently introduced methods (P-matrix and U*-matrix plots) are used to visualize the results on four data sets: IRIS, with 150 patterns in 3 classes of 50 patterns each and 4 features per pattern; 8OX, with 45 patterns in 3 classes of 15 patterns each and 8 features per pattern; the ALL-AML Leukemia microarray data set, with 38 patients in 2 classes (27 ALL, 11 AML) and 7129 genes per patient; and Colon Tumor, with 62 samples in 2 classes (22 normal, 40 tumor) and a total of 2000 genes. The visualization results for each data set are reported using these methods. The 2d-dendrogram appears to be the better tool for visualizing the microarray data, while all the methods perform well on IRIS and 8OX.
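The projection described in the abstract can be illustrated with a minimal NumPy sketch of online SOM training, best-matching-unit lookup, and a simple U-matrix. This is not the thesis's implementation: the grid size, the linear decay schedules, the Gaussian neighbourhood, the function names, and the synthetic two-cluster data are all assumptions chosen for illustration.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Return the (row, col) grid position of the unit closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

def train_som(data, rows=6, cols=6, n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular SOM online; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Assumption: initialise each unit from a randomly chosen data point.
    idx = rng.integers(0, len(data), rows * cols)
    weights = data[idx].reshape(rows, cols, dim).astype(float)
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # assumed linear learning-rate decay
        sigma = sigma0 * (1.0 - frac) + 0.5  # assumed shrinking neighbourhood radius
        x = data[rng.integers(len(data))]
        bmu = best_matching_unit(weights, x)
        # Gaussian neighbourhood centred on the BMU, measured on the map grid.
        d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Pull every unit toward x, weighted by its grid distance to the BMU.
        weights += lr * h[..., None] * (x - weights)
    return weights

def u_matrix(weights):
    """Mean distance from each unit's weight vector to its 4-connected grid neighbours."""
    rows, cols, _ = weights.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            neigh = [
                (r + dr, c + dc)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            um[r, c] = np.mean(
                [np.linalg.norm(weights[r, c] - weights[i, j]) for i, j in neigh]
            )
    return um

# Synthetic stand-in data (not one of the thesis's data sets):
# two well-separated clusters of 4-dimensional patterns.
demo_rng = np.random.default_rng(42)
cluster_a = demo_rng.normal(0.0, 0.3, size=(50, 4))
cluster_b = demo_rng.normal(5.0, 0.3, size=(50, 4))
data = np.vstack([cluster_a, cluster_b])

weights = train_som(data)
umat = u_matrix(weights)
bmu_a = best_matching_unit(weights, cluster_a.mean(axis=0))
bmu_b = best_matching_unit(weights, cluster_b.mean(axis=0))
print("BMU of cluster A centre:", bmu_a)
print("BMU of cluster B centre:", bmu_b)
print("U-matrix shape:", umat.shape)
```

In a sketch like this, high U-matrix values mark units whose neighbours lie far away in the input space, which is how a U-matrix plot suggests cluster boundaries on the map.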

Keywords

SOM, Data Visualization
