Support vector machine (SVM) is a supervised learning algorithm from the machine learning branch of computer science; its classification and regression capabilities can also be applied in related fields such as geoscience and environmental science. In this study, survey data on heavy metal concentrations in agricultural soil were converted with the Nemerow index (PN), and SVM was combined with a geographic information system (GIS) to delineate areas of agricultural land with high heavy metal pollution potential. During modeling, 10-fold cross-validation was used to select the label ratio and sample size of the training set. For Changhua County, the model built from 7,353 sampling points with a positive (PN≥1.0) to negative (PN<1.0) label ratio of 1:2 predicted soil heavy metal pollution potential with an accuracy of 85.37% and an F1-measure of 0.692. For Taoyuan City, the model built from 3,288 points with a label ratio of 1:1 reached an accuracy of 71.58% and an F1-measure of 0.506. The predictions were then overlaid with spatial information on river basins, factories, and industrial zones to assess the causes of, and associations with, the SVM-delineated pollution potential areas. The results indicate that SVM can be applied effectively to delineating heavy metal pollution potential in agricultural soil, and that it achieves good classification performance even with a small training set.
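The labeling and evaluation steps summarized above can be illustrated with a minimal sketch. It is not the study's implementation. The abstract does not state the Nemerow formula, so the commonly used form PN = sqrt((mean(P_i)^2 + max(P_i)^2) / 2), with P_i = C_i / S_i, is assumed here. The function names, the `standards` dictionary, and the undersampling strategy used to reach a target label ratio are all hypothetical. SVM training itself is omitted and could be done with any SVM library.

```python
import math
import random


def nemerow_index(concentrations, standards):
    """Nemerow index of one sampling point.

    Assumed form: PN = sqrt((mean(P_i)^2 + max(P_i)^2) / 2),
    where P_i = C_i / S_i is the single-factor index of metal i.
    Both arguments are dicts keyed by metal name (hypothetical layout).
    """
    ratios = [concentrations[m] / standards[m] for m in standards]
    mean_p = sum(ratios) / len(ratios)
    max_p = max(ratios)
    return math.sqrt((mean_p ** 2 + max_p ** 2) / 2)


def label(pn, threshold=1.0):
    """Positive (1) when PN >= threshold, negative (0) otherwise."""
    return 1 if pn >= threshold else 0


def undersample(labels, neg_per_pos, seed=0):
    """Keep every positive and at most neg_per_pos negatives per positive.

    Returns the sorted indices of the retained samples. This is one
    possible way to reach a 1:1 or 1:2 label ratio, not necessarily
    the procedure used in the study.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), neg_per_pos * len(pos)))
    return sorted(pos + keep_neg)


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-measure for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

In a 10-fold cross-validation, `accuracy_and_f1` would be applied to the held-out fold of each split and the ten results averaged. Accuracy alone can look favorable when negatives dominate the data, which is one reason to report the F1-measure alongside it.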