Exploring the Effective Evaluation Indices of Self-Organizing Map for Clustering Regional Flood Inundation Map

Advisor: 張麗秋

Abstract


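Artificial intelligence (AI) is now one of the most actively developing subjects in both research and practical application across many fields. In water resources research, applying AI to water resources management and flood forecasting has become an important topic. This study proposes a methodology for automatically building a self-organizing map (SOM) that classifies the spatial distribution of flooding. Building an SOM model raises three common problems when judging convergence and selecting the best model: (1) neurons in the topological layer end up in the wrong order, which produces topological errors; (2) the number of iterations must be chosen for each of the two training phases, ordering and convergence; (3) the optimal size of the topological layer must be determined. This study proposes two training strategies for the SOM model and uses three study areas as cases: Luermen Creek and Yenshui Creek in Tainan City, and Kemaman District in Malaysia. Flood-history data produced by a different inundation model for each area serve as training data for examining the convergence of the SOM models. Plan 1: train in the ordering phase until the neuron weights show no obvious change, then enter the convergence phase and stop once the neuron weights again show no obvious change (the coverage rate changes by less than 5%). Plan 2: train in the ordering phase so that the initially disordered weights gradually unfold, enter the convergence phase once the coverage rate reaches 50% to strengthen the model's grasp of the data characteristics, and stop once the weights show no obvious change (the coverage rate changes by less than 5%). The two plans are used to examine how the ordering and convergence phases affect topological convergence, and the SOM classification results are compared through the coverage rate read from cumulative-rate curves, average-weight topology maps, and five indices. The coverage rate is defined as the difference in cumulative rate between the largest and smallest neurons. A flip is defined as a case in which the average-weight topology map violates the direction of the correct neighborhood relation by a difference greater than 5%. For both training plans, the cumulative-rate distribution plots show the coverage rate increasing with the number of iterations, which indicates that training effectively improves the model's grasp of the training data. The average-weight topology maps were then used to confirm whether the neurons show correct neighborhood relations: under Plan 1, flips occurred for both Luermen Creek and Kemaman District, whereas under Plan 2 no flips occurred. Plan 2 is therefore the more suitable SOM training strategy. Plan 2 was then used to train and compare models with different topology sizes. The coverage rate of the 3×3 model is about 5% to 10% lower than that of the 4×4 and 5×5 models, so its grasp of the data is incomplete. The coverage rates of the 4×4 and 5×5 models differ very little, but the standard deviation of the neurons computed from the cumulative-rate distribution plots is consistently smaller for 5×5 than for 4×4, which indicates that the 5×5 neurons are too concentrated and over-describe the data. The 4×4 model is therefore chosen as the optimal topology size. The study also finds that the indices PC, XB, and DBI, applied to the SOM flood models, clearly show that the training process produces a distinct and effective classification.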
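The abstract names DBI among the indices that tracked training quality but does not expand the acronym or give a formula. As a minimal sketch, assuming DBI denotes the Davies-Bouldin index and that each sample is assigned to its nearest neuron, the index could be computed as below. The function name `davies_bouldin` and the choice to use the centroid of each neuron's assigned samples (rather than the neuron weight itself) are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def davies_bouldin(data, weights):
    """Davies-Bouldin index for a hard assignment of samples to SOM neurons.

    data:    (n_samples, n_features) training samples
    weights: (n_neurons, n_features) neuron weight vectors
    Lower values mean more compact, better separated clusters.
    """
    # Assign each sample to its best-matching unit (nearest weight vector).
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    used = np.unique(labels)  # assumption: neurons that won no samples are ignored

    # Centroid of the samples won by each neuron.
    centroids = np.array([data[labels == k].mean(axis=0) for k in used])
    # Scatter: mean distance from each cluster's members to its centroid.
    scatter = np.array([
        np.linalg.norm(data[labels == k] - c, axis=1).mean()
        for k, c in zip(used, centroids)
    ])

    # Pairwise centroid separation; the diagonal is excluded from the max.
    sep = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    # Worst-case similarity per cluster, averaged over clusters.
    return ratios.max(axis=1).mean()
```

Evaluating this after each epoch would give one curve per model; a falling curve would be consistent with the abstract's remark that the index shows training producing an effective classification.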

Parallel Abstract


Artificial intelligence (AI) is now a popular subject with many research topics and practical applications, and applying AI to water resources management and flood forecasting has become an important topic. This study proposes a methodology for automatically building self-organizing maps (SOM) that cluster the spatial distribution of flooding. Building an SOM model raises three major problems. The first is topological error: two neurons swap weights, which breaks the order of the topological map. The second is the choice of the number of epochs. SOM training has two phases, an ordering phase and a convergent phase; each can run for a different number of epochs, and those numbers influence convergence. The third is deciding the optimal map size. This study proposes two training strategies for SOM models and uses Luermen Creek and Yenshui Creek in Tainan, and the Kemaman River in Terengganu, Malaysia, as study areas to investigate the convergence of the SOM models. The first strategy, plan1, trains the network in the ordering phase until the neuron weights show no obvious change, then switches to the convergent phase and continues training until the weights again show no obvious change. The second strategy, plan2, trains the network in the ordering phase until the coverage rate of the weights reaches 50%, then switches to the convergent phase and continues training in the same way as the convergent phase of plan1. Flood simulation data from the three areas are used as training data to build a separate model for each area. Comparing plan1 and plan2 shows how the ordering and convergent phases influence the resulting SOM models. The clustering results are compared using the coverage rate, a flip detector, and five indices.
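The plan2 schedule can be sketched as a small control loop. This is only an illustration under stated assumptions: `train_epoch` is a hypothetical callback that runs one epoch in the named phase and returns the current coverage rate as a fraction, the 5% stopping threshold is the figure given in the first abstract, and "changes by less than 5%" is read as an absolute change in coverage between consecutive epochs. The `max_epochs` cap is added here as a safety limit and is not part of the source.

```python
from typing import Callable, List, Tuple

def run_plan2(
    train_epoch: Callable[[str], float],  # hypothetical: trains one epoch, returns coverage in [0, 1]
    switch_at: float = 0.50,              # coverage that ends the ordering phase
    stop_delta: float = 0.05,             # assumed absolute change that counts as "no obvious change"
    max_epochs: int = 1000,               # safety cap per phase (not from the source)
) -> Tuple[int, int, List[float]]:
    """Run the ordering phase, then the convergent phase.

    Returns (ordering_epochs, convergent_epochs, coverage_history).
    """
    history: List[float] = []

    # Ordering phase: keep unfolding the map until coverage reaches switch_at.
    ordering = 0
    while ordering < max_epochs:
        history.append(train_epoch("ordering"))
        ordering += 1
        if history[-1] >= switch_at:
            break

    # Convergent phase: stop once coverage changes by less than stop_delta.
    convergent = 0
    while convergent < max_epochs:
        previous = history[-1]
        history.append(train_epoch("convergent"))
        convergent += 1
        if abs(history[-1] - previous) < stop_delta:
            break

    return ordering, convergent, history
```

Under the same assumptions, plan1 would differ only in the ordering phase, which would end on the small-change test instead of the 50% coverage test.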
The coverage rate is defined as the difference in cumulative distribution rate between the maximum and minimum weights (neurons). The flip detector checks whether two or more neurons have swapped weights, and thus whether the topological order is correct. The clustering results for the three cases show that the number of epochs influences the coverage rate and the clustering quality: more epochs give a larger coverage rate. Plan2 produces convergent clustering results, whereas plan1 produces flips for Luermen Creek and the Kemaman River. Plan2 is therefore more suitable than plan1 for applying the SOM model to clustering the spatial distribution of flooding. Comparing SOM models of different sizes, the coverage rates of the 3×3 model are about 5% to 10% lower than those of the 4×4 and 5×5 models, which means the 3×3 model cannot describe the characteristics of the data as well as the larger models. The coverage rates of the 4×4 and 5×5 models are almost the same, so the smaller of the two has enough neurons to describe the data; that is, 4×4 is a more appropriate size than the others. Hence, the coverage rate is a useful index for choosing the size of the topological map.
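The coverage rate definition above leaves some details open, so the following is one possible reading rather than the thesis's procedure. It assumes that each sample and each neuron is summarized by the mean of its vector (for example, mean flood depth over the grid cells), that the cumulative rate is the empirical distribution of the sample means, and that coverage is the share of that distribution lying between the smallest and largest neuron means. The function name `coverage_rate` is hypothetical.

```python
import numpy as np

def coverage_rate(data, weights):
    """Share of the sample distribution spanned by the neurons.

    data:    (n_samples, n_features) training samples
    weights: (n_neurons, n_features) neuron weight vectors
    Assumption: samples and neurons are compared through their row means.
    """
    sample_means = np.sort(np.asarray(data, dtype=float).mean(axis=1))
    neuron_means = np.asarray(weights, dtype=float).mean(axis=1)

    # Empirical cumulative count: number of samples <= x.
    upper = np.searchsorted(sample_means, neuron_means.max(), side="right")
    lower = np.searchsorted(sample_means, neuron_means.min(), side="right")

    # Difference in cumulative rate between the largest and smallest neuron.
    return (upper - lower) / sample_means.size
```

With this reading, a map whose neurons cluster near the middle of the data scores low, and a map whose extreme neurons reach toward the driest and wettest samples scores close to 1, which matches the abstract's use of coverage as a measure of how well the model describes the data.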

