Applying Support Vector Regression to the Interpolation of Missing Traffic Data: A Case Study of Fixed Vehicle Detector Data

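Road managers and planners commonly rely on real-time traffic data to understand traffic flow conditions, and use them to formulate management strategies and to provide road-condition information to road users. Traffic data, however, may become abnormal or go missing during collection, transmission, or processing, which affects managers' assessment of the road network; handling missing traffic data is therefore an issue that deserves attention. Many interpolation methods for missing traffic data have been proposed in Taiwan and abroad, but in practice there is no consensus on a best method, because each has limitations in application. Some carry theoretical model assumptions and restrictions, or involve time-consuming or difficult computation. Some depend on black-box software as an auxiliary tool, whose internal workings are hard to understand and whose operation is therefore easily constrained. Some require large amounts of historical data and pattern-recognition methods to build dedicated models in order to improve interpolation performance. In addition, studies rarely examine the stability or transferability of model parameters, and empirical validation is usually confined to specific areas.

Support Vector Regression (SVR) is a machine learning method derived from the support vector machine, which was first applied in pattern recognition. It tolerates error, requires few assumptions, and uses a kernel function to handle nonlinear problems by transformation into a high-dimensional space. Related studies have applied the method to transportation problems such as traffic volume forecasting and travel time prediction, with good results. The purpose of this study is therefore to build an SVR-based quick-response interpolation model and to examine its characteristics, including prediction accuracy, whether the model parameters are stable and transferable, and ease of operation, with the aim of establishing a set of baseline model parameters that can be easily revised and adjusted to suit different scenarios. Unlike earlier studies that feed raw time-series data directly into the model, this study builds a core difference-response model that uses the adjacent upstream and downstream data as anchor values together with the relative change (difference) between the upstream and downstream data. Adjustment factors, for example for the relative position of the reference point and for changes in the number of lanes, can be used to relax restrictions on the model's application. Because traffic data collection in Taiwan still relies mainly on fixed vehicle detectors, this study selects speed data on closed, straight road segments as the demonstration target for interpolation, and calibrates and validates the model on different road classes and regions. Calibration of the SVR model parameters has two parts. The first concerns the model's internal operating parameters: the penalty coefficient C, the kernel width coefficient γ, and the width ε of the insensitive loss function; ε is set in advance and (C, γ) are tuned by K-fold cross-validation. The second is the calibration of the weight coefficients ω and the intercept of the transformed linear regression.

The calibration results show that, for a fixed ε and without removing extreme values, the internal parameters C and γ are inconsistent across scenarios and regions or vary over a wide range, that is, they are unstable. After extreme values are removed, the parameters of some sub-models settle within a narrower range across scenarios and regions, with C mostly converging to between 1 and 4 and γ to between 0.03 and 1, which can be regarded as limited stability. In the linear regression calibration, however, the weight coefficient differs for every support vector, and the intercept does not stay within a fixed range either. In the validation of predictive ability, predictions were poorer (MAPE > 20%) only in a few cases, namely scenarios 2 and 4 in the northern section of National Freeway No. 1; self-validation in the remaining scenarios generally achieved high accuracy (MAPE < 10%). When a representative parameter model selected through comprehensive cross-validation is used to predict across different regions and scenarios, overall predictive ability is good (MAPE = 10.57%).

This study also used the same data to compare against a transfer-function-based quick-response model. Overall, both models are simple to operate and easy to control in practice, and both perform well in transferability and prediction accuracy. They differ in operation: with SVR, choosing the internal kernel function and transforming the data are relatively complex; with the transfer function, one must first determine whether the time series needs differencing and then identify the noise term and the order of the impulse response weights, which is also somewhat complex. In terms of calibration results, the SVR parameters are less stable than those of the transfer function.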
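
For reference, the roles of C, γ, and ε can be seen in the standard textbook formulation of ε-insensitive SVR, sketched below. This is not taken from the thesis. The radial basis function kernel is an assumption suggested by the "kernel width coefficient" γ, and the symbols ξ, φ, α, and b are conventional notation rather than the author's. Reading the per-support-vector weights ω as the dual coefficients α_i − α_i^* is likewise an interpretation, not something the abstract states.

```latex
% Primal problem: C penalizes deviations larger than the tube width \varepsilon
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^{*}\bigr) \\
\text{s.t.}\quad
  & y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
  & w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0,\ \ \xi_i^{*} \ge 0, \qquad i = 1,\dots,n .
\end{aligned}

% Assumed RBF kernel: \gamma controls the kernel width
K(x, x') = \exp\!\bigl(-\gamma\,\lVert x - x'\rVert^{2}\bigr)

% Fitted function: linear in the kernel evaluations at the support vectors
f(x) = \sum_{i \in \mathrm{SV}} \bigl(\alpha_i - \alpha_i^{*}\bigr)\,K(x_i, x) + b
```

Under this reading, fixing ε and tuning (C, γ) by K-fold cross-validation determines the first stage, while the coefficients α_i − α_i^* and the intercept b are what the fitted model then reports for each support vector.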

Keywords

Missing value; support vector regression; difference model; interpolation model

Parallel Abstract

Real-time traffic data are fundamental for traffic authorities and planners who monitor vehicular flows over road networks in order to develop management strategies or provide travel information to road users. However, these data can be disrupted or lost during collection, transmission, or processing, with serious consequences, so handling missing traffic data is an important issue. Various interpolation methods have been proposed, with limited success, because each has shortcomings: some carry theoretical assumptions and conditions, some impose complex computational processes, some require proprietary toolboxes (black boxes) or special-purpose programs, and some need massive historical data for pattern matching. Above all, because validation has been limited, none can be declared the best or applicable to all situations, owing to a lack of stability and transferability in model parameters.

Support Vector Regression (SVR), a machine learning method derived from the Support Vector Machine (SVM), has high error tolerance and can handle nonlinear problems by mapping data into a high-dimensional space through a kernel function. SVR has been applied successfully to transportation problems such as traffic volume forecasting and travel time prediction. The purpose of this thesis is to develop an SVR-based model for traffic data interpolation and to investigate its performance, including prediction accuracy and parameter stability.

This study addresses the missing data problem of fixed vehicle detectors, currently the most commonly used type in Taiwan, and focuses on speed data. The SVR models were specified with a primary "difference" form, in which an attribute variable is defined as the difference between the reference upstream and downstream speeds, together with several extended forms that adjust for roadway geometry. Models were calibrated at several sites covering two road classes and different traffic flow conditions. Parameters were calibrated in two stages: first the three internal SVR parameters, then the two parameters of the linear regression. Model validation included self-validation (at the same site) and cross-validation (across different sites). Results show that prediction accuracy was mostly very good, with MAPE below 10%, while parameter stability and transferability were less satisfactory.

Finally, SVR-based models were compared with transfer-function-based models. Both produced highly accurate predictions relatively quickly using generally accessible computer software. However, the transfer function technique appeared to have higher parameter stability.
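
The accuracy figures above are reported as MAPE (mean absolute percentage error). A minimal sketch of that measure follows, using its usual definition. The function names `mape` and `accuracy_band` and the sample speeds are hypothetical. The 10% and 20% cut-offs mirror the thresholds quoted in the abstract, while the label for the band between them is an assumption, since the abstract does not name it.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The percentage error is undefined for a zero observation.
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def accuracy_band(mape_pct):
    """Map a MAPE value to a band (hypothetical helper).

    The cut-offs follow the abstract: below 10% is described as high
    accuracy and above 20% as poor. The middle label is an assumption.
    """
    if mape_pct < 10.0:
        return "high accuracy"
    if mape_pct > 20.0:
        return "poor"
    return "intermediate"


if __name__ == "__main__":
    # Made-up speeds in km/h, for illustration only.
    observed = [100.0, 80.0, 50.0]
    interpolated = [90.0, 84.0, 50.0]
    score = mape(observed, interpolated)
    print(f"MAPE = {score:.2f}% ({accuracy_band(score)})")
```

Raising an error on zero observations is one design choice among several; a study of congested traffic, where recorded speeds can reach zero, might instead exclude those records or use a different error measure.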

Parallel Keywords

Missing Value; Support Vector Regression; Difference Model; Interpolating Model
