Highway buses often encounter unexpected conditions en route, which undermines the accuracy of travel time estimates. As a result, the timetables posted at conventional bus stops are seldom accurate and cannot meet passengers' needs. To address this problem, this research designs a travel time estimation model that takes into account time of day, vehicle type, driving behavior, and the distance between bus stops, with the aim of improving estimation accuracy.

The data were collected by GPS devices installed on the buses, which sent the arrival and departure records for each stop along the route back to a server. The data cover March to April 2012 for a route in Hsin-Chu County running from Zhu-Dong to Hsin-Chu. There are 192,757 records for March and 190,716 records for April, 383,473 records in total.

The March records served as the training data set. A clustering algorithm was applied to the running time and the waiting time at each stop. The clustering results, together with vehicle, driver, day of the week, departure time slot, and arrival and departure times, were then used to build decision tree models for travel time estimation. Four clusterings were tested: 4 clusters (fast, medium fast, medium slow, slow), 3 clusters (fast, medium, slow), 2 clusters (fast, slow), and 1 cluster (medium, i.e., no clustering is performed), giving four decision tree models. The April records served as the testing data set for validating the models, and the standard deviation of percentage error (SDPE) was used to evaluate the accuracy of each model.

Using this method, the four models were built for every bus stop and every departure time slot, and the SDPE of each was calculated. The best of the four models was then selected in each case, and the share of cases won by each clustering was computed. The experimental results indicate that, after comparing the percentage of estimates under 20% SDPE for each model, the best models from 1 cluster to 4 clusters account for 23.97%, 43.83%, 16.99%, and 15.21% of cases, respectively. The 2-cluster model is therefore the one most often selected as best. Finally, the estimated times from the best model at each bus stop are summed, and this sum is the travel time estimate recommended by this research.
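The model selection step can be illustrated with a minimal Python sketch. It is not the implementation used in this research. The abstract does not give the SDPE formula, so the sketch assumes the percentage error is (predicted − actual) / actual × 100 and that SDPE is the population standard deviation of those errors. The function names `sdpe`, `best_model`, and `route_travel_time` are hypothetical.

```python
from statistics import pstdev


def sdpe(actual, predicted):
    """Standard deviation of percentage errors, in percent.

    Assumption: percentage error = 100 * (predicted - actual) / actual,
    and the population standard deviation is used.
    """
    errors = [100.0 * (p - a) / a for a, p in zip(actual, predicted)]
    return pstdev(errors)


def best_model(actual, predictions_by_k):
    """Return the cluster count k whose predictions give the lowest SDPE.

    predictions_by_k maps a cluster count (1 to 4) to that model's
    predictions for the same test records as `actual`.
    """
    return min(predictions_by_k, key=lambda k: sdpe(actual, predictions_by_k[k]))


def route_travel_time(best_estimates_per_stop):
    """Sum the estimates produced by the best model at each stop."""
    return sum(best_estimates_per_stop)
```

In this sketch, `best_model` would be called once per bus stop and departure time slot on the April test records, and `route_travel_time` would then add up the per-stop estimates from the selected models to give the estimate for the whole route.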