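Applying supervised machine learning to landslide susceptibility prediction on slopes differs from traditional prediction methods based on failure mechanics or in-situ monitoring in several respects. Data-driven algorithms have developed rapidly in recent years and their accuracy has improved substantially; they compute quickly, can cross-analyze data of different types and sources, and produce results that can be visualized and are easy to understand. The data source for this study is Taiwan's multi-year event-based landslide inventory, a database of landslides interpreted from satellite imagery covering the 15 years from 2006 to 2021. From it, 657 landslide records in the Kaoping River basin were selected and combined with 1,155 non-landslide samples drawn using a geographic information system (QGIS) to form the training dataset for supervised classifiers. Parameter settings were screened for four classifiers: logistic regression, k-nearest neighbors, support vector machine, and random forest. The algorithms take five input features: slope, aspect, landslide ratio class, distance to the nearest river, and three-day accumulated rainfall for a single event (typhoon or heavy rain). Models were trained on subsets of three, four, and five of these factors, with multiple parameter sets searched for each classifier. Six metrics were used to build and select a classifier suitable for landslide susceptibility prediction in Taiwan: accuracy, precision, recall, and F1 score from the confusion matrix, the area under the curve (AUC), and the average precision (AP) of the PR curve. The support vector machine with a nonlinear rbf kernel (gamma=0.001, C=1000) was the best of all combinations tested in this study, with a mean 10-fold cross-validation accuracy of 98.9% and, on the validation dataset, an accuracy of 97.58%, precision of 100.00%, recall of 75.00%, F1 score of 85.71%, AUC of 0.88, and AP of 0.77. The random forest algorithm was second best. To improve the model's generalization ability, however, follow-up studies are advised to add a small number of cross-regional landslide samples or to switch to a relevance vector machine (RVM); training a deep learning model with TensorFlow could also be tried.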
This case study presents the results of supervised machine learning using the landslide event inventory catalog for the Kaoping River basin, Taiwan (ROC), from 2006 to 2021. To build the training dataset, four typhoon-induced landslide events and four other rainfall-induced landslide events were collected (2006-2015). Slope, aspect, landslide ratio, distance to the nearest river, and three-day accumulated rainfall were used as the main features. Candidate models were trained with a logistic regression classifier (LR), a k-nearest neighbors classifier (KNN), a support vector machine (SVM), and a random forest algorithm (RF) under different parameter settings, and were compared on accuracy, precision, recall, and F1 score, calculated from the confusion matrix over 10-fold cross-validation. The areas under the ROC curve and the PR curve were also evaluated. The SVM model with an rbf kernel, gamma=0.001, and C=1000 gave the highest 10-fold average accuracy (98.9%) and the best result on the testing dataset (accuracy=97.58%, precision=100.00%, recall=75.00%, F1 score=85.71%).
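As a reading aid for the reported metrics, the following minimal sketch shows how accuracy, precision, recall, and F1 score follow from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are assumptions made for illustration: the counts are hypothetical values chosen only because they are arithmetically consistent with the percentages quoted above, and they are not the study's actual confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts, with "landslide" taken as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts (an assumption, NOT taken from the study): no false
# positives, and one in four landslide samples missed.
metrics = classification_metrics(tp=12, fp=0, fn=4, tn=149)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A precision of 100% together with a recall of 75% means the classifier raised no false alarms on that dataset but missed one quarter of the landslide samples, which an accuracy figure alone would not reveal.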