  • Thesis

整合地理統計與機器學習方法於水文地質架構推估與模擬-以蘭陽平原為例

Integrating Geostatistic and Machine Learning for Hydrogeological Constructure Estimation and Simulation: A Case Study in Yilan Plain

Advisor: 余化龍

Abstract


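In groundwater research, the hydrogeological structure, such as the lithological field and the hydraulic conductivity (K) field, influences groundwater flow, so a reasonable understanding of that structure benefits the research. However, natural constraints, limited funding, and other factors make it impossible to build a large number of observation wells in a study area, so hydrogeological data are scarce. Spatial estimation or simulation methods are therefore needed to increase the spatial resolution of hydrogeological data and to help characterize the hydrogeological structure of the study area.

For spatial estimation of natural environmental data, geostatistics (for example, kriging) has traditionally been used to estimate values at unsampled points from known data, increasing the amount of data in the study area. However, conventional geostatistical methods rest on the ergodicity assumption, which is difficult to verify in natural environments, and they can use only deterministic (hard) data for spatial estimation. This study therefore uses the Bayesian maximum entropy (BME) method to estimate the hydrogeological structure: categorical BME for the three-dimensional lithological field and continuous BME for the K field. BME is an emerging geostatistical method that does not depend on the assumptions of conventional geostatistics. Its main advantage is that it can incorporate uncertain (soft) data to support spatial estimation, so that more kinds of data can be brought into the model and improve the estimates. Accordingly, this study uses a probabilistic support vector machine to build a preliminary three-dimensional lithological probability field, which serves as uncertain data supporting the categorical BME estimate of the lithological field. It also builds a linear K model from geophysical data and lithological information and generates uncertain K values that support the continuous BME estimate of the K field.

This study also simulates the three-dimensional lithological field. Most previous studies simulated lithological fields with multiple-point geostatistics, which can use only images as training data. In practice, however, most available lithological data are numerical, from which images are hard to construct, and each simulation run of that method produces only one realization. This study therefore uses a conditional generative adversarial network (CGAN) to simulate the three-dimensional lithological field of the Yilan Plain. A CGAN can be trained on numerical data and yields a simulation model that generates a large number of realizations, and the lithological fields built from these realizations have spatial statistical properties similar to those of the training data.

MODFLOW is one of the most widely used groundwater models, and building a MODFLOW model requires the boundaries of each aquifer and aquitard in the study area as input. Because real-world lithology is complexly distributed and hard to divide into layers, most previous studies identified aquifer and aquitard boundaries only roughly. This study therefore takes a data-science perspective to delineate the layers of the Yilan Plain in finer detail, identifying four aquifers and three aquitards.

This study integrates geostatistical and machine learning methods to build the hydrogeological structure. The approach can be adapted in the future and applied to other groundwater basins, and we hope it contributes to groundwater research.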
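To make the traditional baseline mentioned above concrete, the following is a minimal sketch of ordinary kriging at a single unsampled point, written in plain Python. It is not the thesis's implementation. The exponential covariance model, its `sill` and `corr_range` values, the well coordinates, and the log10 K values are all illustrative assumptions.

```python
import math

def exp_covariance(h, sill=1.0, corr_range=10.0):
    """Exponential covariance C(h) = sill * exp(-h / corr_range) (assumed model)."""
    return sill * math.exp(-h / corr_range)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, values, target, cov=exp_covariance):
    """Return (estimate, kriging variance) at `target` from hard data only."""
    n = len(points)
    # Data-to-data covariances, bordered by the unbiasedness constraint sum(weights) = 1.
    a = [[cov(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Data-to-target covariances, plus the constraint's right-hand side.
    b = [cov(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = cov(0.0) - sum(w * c for w, c in zip(weights, b[:n])) - mu
    return estimate, variance

# Hypothetical wells (x, y, z) with log10 K values; not data from the thesis.
wells = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 5.0)]
log_k = [-4.0, -3.5, -5.0]
print(ordinary_kriging(wells, log_k, (3.0, 4.0, 1.0)))
```

The sketch accepts only exact (hard) values, which is the limitation the abstract attributes to conventional geostatistics. BME differs in that it can also take probabilistic (soft) data at other locations.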

Parallel Abstract


In groundwater applications and studies, hydrogeological structure, such as the lithological field and the hydraulic conductivity field, plays an important role in contaminant transport and groundwater flow. It is therefore important to understand the hydrogeological structure of the study area. However, because of practical restrictions, only a limited amount of hydrogeological data can be acquired. It is thus necessary to use spatial estimation or simulation methods to increase the hydrogeological data available in the study area.

For spatial estimation of environmental data, previous studies often used geostatistical methods such as kriging to estimate the hydrogeological structure at unsampled points. However, those methods rest on the ergodicity assumption, which is difficult to verify in practice. Therefore, this study applied the Bayesian maximum entropy (BME) method to estimate the hydrogeological structure. The three-dimensional lithological field was estimated by categorical BME, and the hydraulic conductivity field was estimated by continuous BME. BME is an emerging method in the field of spatiotemporal geostatistics, and it does not rely on the assumptions of conventional geostatistical methods. Its main advantage is that it can integrate soft data to assist spatial estimation, so that more data can be incorporated into the model to improve estimation performance. In this study, a preliminary three-dimensional lithological probability field was established with a probabilistic support vector machine (pSVM) and used as soft data to assist the estimation of the lithological field. Soft data for hydraulic conductivity were derived from geophysical data and lithological data and used to assist the continuous BME estimation of the hydraulic conductivity field.

Furthermore, this study also attempted to simulate the three-dimensional lithological field. Most previous studies used multiple-point geostatistics to simulate lithological fields. This method can use only images as training data, whereas most of the lithological data that can actually be obtained are not in image form. In addition, this method can produce only one simulation result per simulation run. Therefore, this study applied a conditional generative adversarial network (CGAN) to simulate the three-dimensional lithological field of the Yilan Plain. A CGAN can use not only images but also numerical data as training data, and it yields a simulation model that can generate a large number of simulation results.

MODFLOW is one of the most widely used groundwater models, and its modeling process requires the boundaries of each aquifer and aquitard as input. However, because of the complex distribution of lithology in reality, most previous studies identified the boundaries of aquifers and aquitards only roughly. Therefore, this study attempted to divide the Yilan Plain into four aquifers and three aquitards in a more detailed way, from the viewpoint of data science.

This study integrated geostatistics and machine learning methods to establish the hydrogeological structure. We look forward to applying the method of this study to other groundwater basins.
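As an illustration of how soft data for hydraulic conductivity might be derived from a linear model, the sketch below fits one least-squares line per lithology class. Each line relates log10 K to a single geophysical covariate, and the residual spread is reported as the uncertainty. The abstract does not specify the model form, so several things here are assumptions rather than the thesis's method: the choice of covariate (taken to be log resistivity), the Gaussian form of the soft datum, the function names, and the sample numbers.

```python
import math
import statistics

def fit_linear_logk(x, log_k):
    """Least-squares fit log_k ~ a + b * x; returns (a, b, residual_std)."""
    n = len(x)
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(log_k)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, log_k))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, log_k)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return a, b, sigma

def fit_by_lithology(samples):
    """Fit one line per lithology from (lithology, covariate, log10 K) tuples."""
    groups = {}
    for litho, x, log_k in samples:
        xs, ys = groups.setdefault(litho, ([], []))
        xs.append(x)
        ys.append(log_k)
    return {litho: fit_linear_logk(xs, ys) for litho, (xs, ys) in groups.items()}

def soft_logk(litho, x_new, models):
    """Gaussian soft datum (mean, std) for log10 K where only the covariate is known."""
    a, b, sigma = models[litho]
    return a + b * x_new, sigma

# Hypothetical calibration samples: (lithology, log10 resistivity, log10 K).
samples = [
    ("sand", 1.8, -3.9), ("sand", 2.0, -3.6), ("sand", 2.3, -3.2), ("sand", 2.5, -3.1),
    ("clay", 0.9, -7.8), ("clay", 1.1, -7.5), ("clay", 1.3, -7.4), ("clay", 1.4, -7.1),
]
models = fit_by_lithology(samples)
print(soft_logk("sand", 2.1, models))
```

Using the residual standard deviation as the spread is a deliberate simplification. A fuller treatment would also account for parameter uncertainty in the fitted line.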

