Generalizing Self-Organizing Maps to Handle Categorical and Hybrid Data

Abstract


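A self-organizing map (SOM) is an unsupervised neural network that projects high-dimensional data onto a low-dimensional space and presents it visually, reflecting the similarity among the high-dimensional data. SOMs are widely applied in both engineering and business, for example in pattern recognition, speech recognition, process monitoring and control, document maps, and consumer data analysis. However, a conventional SOM can handle only numerical data; categorical data must be encoded as a set of binary numerical attributes, which cannot reflect the degree of similarity among categorical values. To address this problem, this study proposes an improved SOM that directly handles categorical or mixed-type data while still reflecting, in the projected low-dimensional space, the similarity among the high-dimensional data. Through experiments on artificial and real data, we verify the correctness of the proposed method, examine the characteristics of the improved SOM, and discuss an application to a catalog marketing case.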

Parallel Abstract


Self-organizing maps (SOMs) are a kind of unsupervised neural network that projects high-dimensional data onto a lower-dimensional space and, at the same time, visually reveals the similarity among the original high-dimensional data. SOMs have been successfully applied in many fields, including engineering and business applications such as texture identification, speech recognition, process monitoring and control, document maps, and consumer data analysis. However, conventional SOMs handle only numerical data; categorical data has to be converted to Boolean data, and this encoding fails to reveal the similarity among categorical values. This paper proposes a refined self-organizing map that can directly handle categorical or hybrid data, map the data to lower dimensions, and still reveal the similarity among the data. In the experiments, artificial and real data are used to demonstrate the correctness of the proposed model and to gain insight into the refined self-organizing maps.
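
The limitation of Boolean encoding described above can be illustrated with a minimal sketch. It is not the method proposed in the paper; the category names and the helper functions `one_hot` and `euclidean` are hypothetical, chosen only for illustration. Under one-hot (Boolean) encoding, every pair of distinct categorical values ends up at the same Euclidean distance, so any intuitive similarity between values is lost.

```python
import itertools
import math

def one_hot(value, categories):
    """Encode a categorical value as a Boolean (0/1) vector."""
    return [1.0 if value == c else 0.0 for c in categories]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical categorical attribute: two soft drinks and one hot drink.
categories = ["Coke", "Pepsi", "Coffee"]
vectors = {c: one_hot(c, categories) for c in categories}

# Distance between every pair of distinct categories.
distances = {
    (a, b): euclidean(vectors[a], vectors[b])
    for a, b in itertools.combinations(categories, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 4))
```

Every pair is separated by the same distance (the square root of 2), so the encoding cannot express that "Coke" is intuitively closer to "Pepsi" than to "Coffee".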

