  • Thesis

A Study of the Taipei MRT Ridership by Supervised Learning Methods

Original title: 以監督式學習探討北捷旅運量

Advisor: 王曉玫

Abstract


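Passenger ridership forecasting is an important factor in building and planning transportation systems, and it is the basis for demand planning when new transportation facilities are built or existing ones are expanded. Most previous studies of ridership forecasting used the monthly or annual total ridership of the Taipei MRT as their data. Few studies, however, have attempted to model daily ridership across all MRT lines. To forecast ridership at a finer granularity, this study uses the daily ridership of 108 Taipei MRT stations from August 1, 2015 to July 31, 2019, a total of 154,318 records, and examines how MRT line, month, tourist attractions, rainfall, ultraviolet index, temperature, and day type (weekday or holiday) affect exit ridership. The study applies three supervised learning algorithms (multiple regression, CART regression tree, and Cubist regression tree) to Taipei MRT ridership prediction and evaluates them with four metrics: the coefficient of determination (R2), the adjusted coefficient of determination (Adj-R2), Min Max Accuracy, and the mean absolute percentage error (MAPE). The results show that the multiple regression model performs best only in explained variance (R2 and Adj-R2 are both 99.39%); this is because it uses more than 10 explanatory variables in prediction, which makes it impractical for real use. The Cubist regression tree (Min Max Accuracy of 95.5% and MAPE of 4.7%) is therefore recommended for estimating ridership. In its piecewise regression equations, tourist attractions (100%), MRT line (98%), and day type (79~82%) are the three variables that most affect ridership.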

Parallel Abstract


Ridership forecasting is an important factor in the construction and planning of transportation systems. It is the fundamental basis for plans to build and expand transportation facilities. Most previous research on predicting MRT ridership has focused on monthly or annual ridership data. It should be noted, however, that there have been few attempts to model daily ridership across all MRT lines. To predict ridership at a finer granularity, the proposed prediction models are evaluated on a total of 154,318 observations, which cover 108 stations and 6 lines during the period from Aug. 1, 2015 to July 31, 2019. The factors studied here include MRT line, month, sight, rain, UVI, temperature, and day status (working day or holiday). In this research we compare three supervised learning methods (multiple linear regression, CART regression tree, and Cubist regression tree) for estimating MRT ridership. The coefficient of determination (R2), the adjusted coefficient of determination (Adj-R2), Min Max Accuracy, and the mean absolute percentage error (MAPE) are used as the four evaluation measures. The results show that the multiple regression model performs best only in R2 and Adj-R2 (both 99.39%), because it uses more than 10 explanatory variables in the prediction, which makes it impractical to use. Therefore, the Cubist regression method is suggested in the present study for predicting MRT ridership. In addition, sight (100%), MRT line (98%), and day status (79~82%) are the three most influential variables in the piecewise regressions of the Cubist model.
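
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas used in the thesis, so the definitions below are the commonly used ones and should be read as assumptions: Min Max Accuracy is taken as the mean of min(actual, predicted) / max(actual, predicted), and MAPE as the mean of |predicted - actual| / actual. The function names and the toy ridership values are hypothetical and are not data from the study.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R2, penalising R2 for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def min_max_accuracy(actual, predicted):
    """Mean ratio of the smaller to the larger of each (actual, predicted) pair.

    Assumed definition; values must be positive. 1.0 means a perfect fit.
    """
    return mean(min(a, p) / max(a, p) for a, p in zip(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 = 5%)."""
    return mean(abs(p - a) / a for a, p in zip(actual, predicted))


# Hypothetical daily exit counts for three station-days (not thesis data).
actual = [100, 200, 400]
predicted = [110, 190, 400]

print("R2:", r_squared(actual, predicted))
print("Min Max Accuracy:", min_max_accuracy(actual, predicted))
print("MAPE:", mape(actual, predicted))

# Assumption: all 154,318 observations and exactly 10 predictors are used.
print("Adj-R2:", adjusted_r_squared(0.9939, 154_318, 10))
```

Under the assumptions in the last call, the adjustment factor (n - 1) / (n - p - 1) is so close to 1 that Adj-R2 rounds to the same 99.39% as R2, which is consistent with the two identical figures reported in the abstract.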

