Structure Identification of Artificial Neural Network for Groundwater Simulation Using AIC


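This study aims to establish a method for determining the optimal structure of a back-propagation neural network that simulates water levels in a groundwater aquifer system. Such a network can quickly evaluate the effect of different pumping and recharge schemes on groundwater levels. The study treats the choice of the number of hidden neurons in the back-propagation network as an ordinary parameter calibration process. To avoid the over-parameterization that results from the conventional use of mean squared error as the structure optimization index, it proposes using AIC (Akaike's Information Criterion) to determine the number of hidden neurons and select the model, taking model error, parameter dimension, and number of observations into account together. The proposed method is applied to the Yunlin area of the Chou-shui River Basin: synthetic net recharge is input to an already constructed numerical groundwater flow model to generate groundwater level data, which are then used to train and validate the neural network. The results show that, as the number of neurons increases, AIC gradually decreases to a minimum and then rises because of over-parameterization, indicating that AIC can indeed identify a reasonable number of hidden-layer neurons. Moreover, the number of hidden neurons selected increases with the number of training records, which shows the method to be better than the conventional approach. In addition, the initial weights and biases have little influence on this optimization method and can simply be assigned at random. The study also finds that when there are too many training data, the neural network may likewise become over-parameterized in trying to fit the excess data.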


The purpose of this paper is to present an optimization algorithm for identifying the best structure of an artificial neural network (ANN) that simulates a groundwater system. The identified ANN can then be applied to evaluate quickly the groundwater level variation caused by any pumping strategy. To identify an ANN model, the number of nodes in every layer must be selected, especially in the hidden layer. In this paper, Akaike's Information Criterion (AIC) is applied to calibrate the ANN model and so decide the proper number of nodes in the hidden layer. AIC considers model error, parameter dimension, and number of observations together in order to avoid over-parameterization. The proposed algorithm is applied to Yunlin in the Chou-shui River Basin. Synthetic net recharge data are input to a numerical groundwater flow model to produce head observation data, and the net recharge and groundwater head data are used to train and validate the ANN. As the number of nodes in the hidden layer increases, AIC decreases at first, reaches its minimum, and then increases because of over-parameterization. The results demonstrate that the new approach offers a useful reference for deciding the number of nodes in the hidden layer. The optimized number of nodes also increases as the number of observations rises. The initial assignment of weights and biases has little influence on the optimization. When too much training data is supplied, the identified ANN may itself become over-parameterized in fitting it.
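The selection rule described above can be illustrated with a minimal sketch. The abstract does not state which form of AIC is used, so the code assumes the common least-squares form, AIC = N ln(SSE/N) + 2k, where N is the number of observations, SSE is the sum of squared errors, and k is the number of weights and biases of a network with one hidden layer. The function names, the network dimensions, and the SSE values are hypothetical and chosen only for illustration; they are not results from the study.

```python
import math


def count_parameters(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a network with a single hidden layer."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def aic_least_squares(sse, n_obs, n_params):
    """Assumed least-squares form of AIC: N * ln(SSE / N) + 2k."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


def select_hidden_nodes(sse_by_hidden, n_obs, n_inputs, n_outputs):
    """Return the hidden-node count with the lowest AIC and all AIC scores."""
    scores = {
        h: aic_least_squares(sse, n_obs, count_parameters(n_inputs, h, n_outputs))
        for h, sse in sse_by_hidden.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical training errors: SSE keeps shrinking as nodes are added,
# but the gain beyond three nodes is too small to offset the 2k penalty.
sse_by_hidden = {1: 50.0, 2: 20.0, 3: 10.0, 4: 9.5, 5: 9.4}
best, scores = select_hidden_nodes(
    sse_by_hidden, n_obs=100, n_inputs=3, n_outputs=2
)

for h in sorted(scores):
    print(f"hidden nodes = {h}, AIC = {scores[h]:.2f}")
print("selected hidden nodes:", best)
```

With these made-up numbers, AIC falls, reaches a minimum, and then rises, whereas selecting on error alone would always favor the largest network.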
