
Applying Random Forest and Rough Set Approach for Analyzing Road Traffic Accident Data – Using New Taipei City Cyclist Accident Data as Case Study

Advisor: 許超澤

Abstract


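As methods for obtaining research data have become more diverse and mature, Europe began governing research data under the FAIR principles in 2016 in order to make data easier to work with. The principles require data to be Findable, Accessible, Interoperable, and Reusable. Sound data governance therefore helps different agencies integrate and share their data, and linking data across agencies raises the value of data applications. In Taiwan, road traffic safety analysis and accident-cause determination rely mainly on the accident data of the National Police Agency (NPA), but the NPA accident data are not linked to data held by other agencies. When the Traffic Accident Investigation Committee determines a cause that differs from the NPA's, the NPA accident data still record the cause from the police's preliminary judgment. Because the correctness and completeness of accident data affect the results of subsequent traffic safety analysis, this study links the accident data of the NPA and the Traffic Accident Investigation Committee, which makes the accident data more complete and helps further clarify traffic accident problems.

Bicycles are the representative mode of green transportation promoted in many countries, but cyclists are vulnerable road users whose accident risk is far higher than that of other modes. Over the four most recent years (2016 to 2019, ROC years 105 to 108), the numbers of bicycle-involved accidents and deaths in Taiwan rose year by year, averaging more than 9,000 bicycle-involved accidents and more than 100 deaths per year, which shows that bicycle safety cannot be ignored. This study analyzes the paper records of bicycle accidents held by the Traffic Accident Investigation Committee of New Taipei City for 2011 to 2019 (ROC years 100 to 108), including the accident data that the NPA provided to the committee. Unlike other studies, which mostly use descriptive statistics and traditional statistical models to explore the key factors affecting injury severity, this study uses rough set theory to analyze accident data that are uncertain and incomplete, in order to explore the causal relationship between accident factors and injury severity. Before building the model, the study first coded the paper records; the coding mainly follows the NPA's accident fields, with new coded fields added and existing fields subdivided as the research required. The data were then cleaned, and the NPA and committee accident data were linked. The linked data were divided into three subsets according to the responsibility attributed to the cyclist: fully-at-fault accidents, partially-at-fault accidents, and not-at-fault accidents. To avoid generating too many invalid decision rules when building the rough set model, the study first ranked variable importance with a random forest model to select the important variables. The decision rules produced by rough set theory were then used to explore cyclists' common improper road-use behaviors (from the fully-at-fault and partially-at-fault subsets) and the improper road-use behaviors of the other vehicle that should be guarded against (from the not-at-fault subset).

The results show that, in the decision rules for fully-at-fault and partially-at-fault accidents, cyclists often commit violations such as improper turning, infringing on another vehicle's right of way, failing to yield to main-lane traffic when pulling out from off the road, and riding against traffic. The decision rules for not-at-fault accidents show that the other vehicle often fails to keep a safe distance from the bicycle when overtaking on road segments and fails to yield to a bicycle going straight when turning right. Compared with motor vehicles, bicycles are covered by fewer laws and regulations and less safety advocacy, and some advocacy content overlooks common bicycle safety problems. To improve bicycle safety, the study recommends that, in the short term, specific violations be strictly enforced and bicycle safety advocacy content be reviewed and adjusted to current conditions; in the medium and long term, it recommends training bicycle safety education and advocacy teams and improving bicycle-related regulations and safety management.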
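The linkage and splitting steps described above can be illustrated with a minimal sketch. It assumes each accident carries a shared case identifier in both sources; the field names (`case_id`, `cause`, `cyclist_fault`) and all record values are hypothetical and are not taken from the NPA codebook or the thesis data. The committee's cause replaces the police's preliminary cause, which is kept under a separate key.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
npa_records = [
    {"case_id": "A001", "cause": "improper turning", "injury": "minor"},
    {"case_id": "A002", "cause": "unknown", "injury": "severe"},
    {"case_id": "A003", "cause": "wrong-way riding", "injury": "minor"},
]
committee_records = [
    {"case_id": "A001", "cause": "improper turning", "cyclist_fault": "all"},
    {"case_id": "A002", "cause": "unsafe overtaking by other vehicle",
     "cyclist_fault": "none"},
    {"case_id": "A003", "cause": "wrong-way riding", "cyclist_fault": "partial"},
]


def link_records(npa, committee):
    """Join both sources on case_id; the committee's cause overrides the police one."""
    by_id = {rec["case_id"]: rec for rec in committee}
    linked = []
    for rec in npa:
        match = by_id.get(rec["case_id"])
        if match is None:
            continue  # cases without a committee record are dropped in this sketch
        merged = dict(rec)
        merged["police_cause"] = rec["cause"]  # keep the preliminary cause for reference
        merged["cause"] = match["cause"]
        merged["cyclist_fault"] = match["cyclist_fault"]
        linked.append(merged)
    return linked


def split_by_fault(linked):
    """Group linked records by the cyclist's degree of responsibility."""
    groups = defaultdict(list)
    for rec in linked:
        groups[rec["cyclist_fault"]].append(rec)
    return dict(groups)


linked = link_records(npa_records, committee_records)
subsets = split_by_fault(linked)
for fault, records in sorted(subsets.items()):
    print(fault, [r["case_id"] for r in records])
```

Keeping the police's preliminary cause alongside the committee's determination makes it possible to count how often the two disagree, which is the gap the linkage is meant to close.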

Parallel Abstract


As methods for data collection have become more diverse, European countries began managing research data under the FAIR principles in 2016. The FAIR principles state that data should be Findable, Accessible, Interoperable, and Reusable. Comprehensive data management therefore helps different agencies integrate and share data, and linking data from different agencies also enhances the value of its applications. Current road traffic accident analysis and accident investigation in Taiwan mainly use the traffic accident dataset provided by the National Police Agency (NPA). However, the NPA dataset is not linked with the data of other agencies. When the cause determined by the Traffic Accident Investigation Committee differs from the NPA's, the cause recorded in the NPA dataset is not updated. Because the accuracy and completeness of traffic accident data affect the results of traffic accident analysis, this study linked the NPA traffic accident dataset with the dataset from the Traffic Accident Investigation Committee of New Taipei City to improve the completeness of the data and thereby clarify traffic accident problems.

Bicycles represent green transportation in many countries, but cyclists are vulnerable road users, and those involved in a crash face a much higher risk of injury than motor vehicle users. From 2016 to 2019, Taiwan averaged more than 9,000 cyclist-involved accidents and more than 100 deaths per year. This shows that cyclist safety cannot be ignored and urgently needs improvement. This study used the cyclist accident investigation report data (including the NPA traffic accident investigation reports) held by the Traffic Accident Investigation Committee of New Taipei City for 2011 to 2019. It differs from other studies, which often use descriptive statistics and traditional statistical models to explore the important factors affecting injury severity in traffic accidents.

The rough set approach can be used to analyze traffic accident data that are uncertain and incomplete, and this study uses it to explore the relationships between accident factors and injury severity. First, before modeling, we coded the cyclist accident investigation report data, mainly following the NPA accident investigation codebook; as the research required, we added new coded fields and subdivided existing ones. Second, we cleaned the data and linked the NPA traffic accident dataset with the dataset from the Traffic Accident Investigation Committee of New Taipei City (hereinafter the cyclist accident dataset). Third, we split the cyclist accident dataset into three sub-datasets according to the cyclist's degree of responsibility for the accident: all-at-fault, partial-at-fault, and not-at-fault. Fourth, we used a random forest model to select the important variables before building the rough set model, in order to avoid producing many invalid decision rules. Fifth, we used the decision rules to explore cyclists' risky riding behaviors (from the all-at-fault and partial-at-fault sub-datasets) and the risky driving behaviors of other vehicles that cyclists should guard against (from the not-at-fault sub-dataset).

The results show that cyclists often turn improperly, fail to yield the right of way, and ride the wrong way, among other violations. Furthermore, other vehicles often fail to keep a safe lateral distance from cyclists and fail to yield to a cyclist going straight when turning right. Compared with those for other vehicles, cyclist-related laws and regulations are insufficient, and cyclist safety advocacy content overlooks cyclists' risky riding behaviors. To enhance cyclist safety, in the short term the authorities concerned can enforce the law against specific violations and update cyclist safety advocacy content to reflect current conditions. In the medium and long term, they can train cyclist safety advocacy teams and improve cyclist-related laws, regulations, and safety management.
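To illustrate how a rough set approach handles inconsistent accident records, here is a minimal sketch using the standard lower and upper approximation definitions. The toy decision table, attribute names, and values are hypothetical; the abstract does not state which software or rule-induction algorithm the study used, so this is not the author's implementation. Records with identical condition values but different severities fall into the boundary region and yield no certain rule.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes -> injury severity.
table = [
    {"violation": "wrong-way", "location": "intersection", "severity": "severe"},
    {"violation": "wrong-way", "location": "intersection", "severity": "minor"},
    {"violation": "improper-turn", "location": "intersection", "severity": "severe"},
    {"violation": "improper-turn", "location": "segment", "severity": "minor"},
    {"violation": "improper-turn", "location": "segment", "severity": "minor"},
]
conditions = ("violation", "location")
decision = "severity"


def indiscernibility_classes(rows, attrs):
    """Group row indices that share the same values on all condition attributes."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return classes


def approximations(rows, attrs, dec, value):
    """Return the (lower, upper) approximations of the set {rows with dec == value}."""
    target = {i for i, r in enumerate(rows) if r[dec] == value}
    lower, upper = set(), set()
    for members in indiscernibility_classes(rows, attrs).values():
        block = set(members)
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


def certain_rules(rows, attrs, dec):
    """Emit one (conditions, outcome, support) rule per consistent class."""
    rules = []
    for key, members in indiscernibility_classes(rows, attrs).items():
        outcomes = {rows[i][dec] for i in members}
        if len(outcomes) == 1:
            rules.append((dict(zip(attrs, key)), outcomes.pop(), len(members)))
    return rules


lower, upper = approximations(table, conditions, decision, "severe")
rules = certain_rules(table, conditions, decision)
print("lower:", sorted(lower), "upper:", sorted(upper))
for cond, outcome, support in rules:
    print(cond, "=>", outcome, "(support", str(support) + ")")
```

Each extra condition attribute splits the records into smaller classes and so multiplies the number of low-support rules, which is why screening variables first, as the study does with a random forest, keeps the rule set manageable.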

