
Application of Random Forests and Decision Trees to Severity Analysis of Traffic Accidents

Advisor: 吳文方
The full text will be available for download on 2029/12/31.

Abstract


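This study aims to identify, from a large body of historical data, the variables that account for differences in traffic accident severity. Because traditional statistical methods rest on many prior assumptions, their results can be biased and they cannot reliably identify these variables. This study therefore adopts a nonparametric approach and builds severity prediction models to find the variables, especially the key ones, that affect accident severity. The models use Classification and Regression Trees (CART) and Random Forests, both to better understand the underlying association rules and to let the two methods corroborate each other, so that the conclusions are reliable and stable. In addition, because many traffic accident data sets suffer from class imbalance, this study applies the Synthetic Minority Over-sampling Technique (SMOTE) to address the problem. Taipei City traffic accident data collected for 2012 to 2017 serve as the case study, and the demonstration shows that the proposed prediction model is feasible. The analysis also finds that vehicle type is the variable with the greatest influence on accident severity in Taipei City during that period, and that, among accidents involving casualties, motorcycles and bicycles carry higher risk than large buses, passenger cars, and large trucks.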
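As an illustration of how a CART-style tree can rank a variable such as vehicle type, the following minimal sketch scores candidate binary splits by weighted Gini impurity. It is not the thesis's implementation: the function names, the feature names, and the six toy records are all hypothetical, and the thesis's actual data and settings are not shown here. A random forest would repeat this kind of split search over many bootstrap samples and random feature subsets.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(rows, labels, feature, value):
    """Weighted Gini impurity after the binary split row[feature] == value."""
    left = [y for row, y in zip(rows, labels) if row[feature] == value]
    right = [y for row, y in zip(rows, labels) if row[feature] != value]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_split(rows, labels):
    """Return the (feature, value) pair with the lowest weighted impurity."""
    candidates = {(f, row[f]) for row in rows for f in row}
    # Ties are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda fv: (split_impurity(rows, labels, *fv), fv))


# Hypothetical toy records, invented for illustration only.
rows = [
    {"vehicle": "motorcycle", "weather": "rain"},
    {"vehicle": "motorcycle", "weather": "clear"},
    {"vehicle": "car", "weather": "rain"},
    {"vehicle": "car", "weather": "clear"},
    {"vehicle": "bicycle", "weather": "clear"},
    {"vehicle": "truck", "weather": "rain"},
]
labels = ["injury", "injury", "property", "property", "injury", "property"]

feature, value = best_split(rows, labels)
print(feature, value, split_impurity(rows, labels, feature, value))
```

In this toy set, every split on the hypothetical `vehicle` feature yields lower impurity than any split on `weather`, which is the sense in which a tree would treat it as the more informative variable.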

Parallel Abstract


The purpose of this study is to identify the variables that affect the severity of traffic accidents by examining a large amount of historical data. Because traditional statistical methods require many prior assumptions, which can lead to biased results, this study uses a non-parametric approach and proposes a severity prediction model. The model uses Classification and Regression Trees (CART) and Random Forests; the latter mitigates the over-fitting problem of the former. The Synthetic Minority Over-sampling Technique (SMOTE) is also employed to address class imbalance. As a case study, traffic accident data for Taipei City from 2012 to 2017 are analyzed, and the results demonstrate the applicability of the proposed model. Moreover, the most critical variable affecting accident severity in Taipei City during that period is found to be “vehicle type.” Among accidents involving casualties, motorcycles, pedestrians, and bicycles carry higher risk than passenger cars and trucks.
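The core idea of SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random point on the segment between a real minority sample and one of its k nearest minority-class neighbours. The sketch below is a simplified illustration using only the Python standard library, not the thesis's code. The function name `smote`, its parameters, and the toy points are assumptions, and it handles numeric features only, whereas accident records typically also contain categorical variables.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (needs at least two points).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class points (e.g. severe accidents), for illustration.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region the real samples already occupy rather than being duplicated verbatim.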

