  • Thesis


Global spatiotemporal representations and feature extraction in video sequences

Advisor: 蔡篤銘


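Intelligent video surveillance systems have become an important tool for security monitoring. Video carries both temporal and spatial information about moving subjects, and a motion representation is needed to describe how different activities vary in space and time. Representation strategies fall into two groups: local (microscopic) and global. Local methods decompose the human subject into limb movements; they can express fine detail but are time-consuming. Global methods instead mark how all moving points in the image change over time and require no complex articulated body model. Every action has two characteristics, speed and directional disorder, and these two characteristics can express the differences between behaviors. This study therefore evaluates existing global motion representations and proposes an improved one. For each global representation it designs and extracts two motion feature indices for the moving subject, moving speed and directional disorder, and uses them to analyze how well each representation performs. The results can help improve the detection capability of automated surveillance systems, with applications in home security monitoring, human-computer interaction, and detection of specific abnormal behaviors; combined with smartphones, they allow security monitoring anytime and anywhere.

The global representations studied fall into two classes: optical flow and motion history. Optical flow computes a motion vector at every spatial position from two consecutive frames and uses that vector to represent the displacement and direction of each point on a moving object. It gives displacement (speed and direction) information for individual points directly, but because it considers only two adjacent frames it cannot express a complete behavior over a longer sequence. Motion history assigns each foreground point a motion energy that decays as the point recedes into the past, so the energy level encodes how recent a movement is and how the motion revisits locations in space. It can represent long-duration actions more completely, but the subject's speed and direction cannot be read directly from the map. This study combines the strengths of both and proposes an optical flow-based motion energy map, whose energy expresses the relationship between the subject's trajectory and time and whose energy levels describe speed and directional disorder more accurately, improving the expressiveness of the representation.

The study designs a series of test videos with gradually increasing speed and directional variation, and also uses the public BEHAVE, KTH and Weizmann datasets. The correlation coefficient and the Fisher ratio are used to evaluate the effectiveness of the speed and directional-disorder indices and the discriminability of behavior classification. The experiments show that optical flow, because it considers only two adjacent frames, is less descriptive than motion history, and that the proposed optical flow-based motion energy map is more expressive than either class of method.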
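
The motion history idea described above (each foreground point receives an energy that decays as time passes) can be illustrated with a minimal sketch. The linear update follows the commonly used temporal-template form of Bobick and Davis; the exponential variant, the function names, and the parameters `tau`, `delta`, `alpha` and `peak` are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def update_mhi(mhi, foreground, tau=30.0, delta=1.0):
    """Linear-decay motion history update (temporal-template style).

    Foreground pixels are reset to the maximum energy tau; all other
    pixels lose delta per frame and are clipped at zero.
    """
    decayed = np.maximum(mhi - delta, 0.0)
    return np.where(foreground, tau, decayed)

def update_exponential_mhi(mhi, foreground, alpha=0.9, peak=1.0):
    """Exponential-decay variant (assumed form): old energy is scaled by alpha."""
    return np.where(foreground, peak, alpha * mhi)

def demo():
    """A single foreground pixel moves left to right across a 1x5 strip."""
    mhi = np.zeros((1, 5))
    for t in range(5):
        fg = np.zeros((1, 5), dtype=bool)
        fg[0, t] = True  # the pixel occupies column t at frame t
        mhi = update_mhi(mhi, fg, tau=5.0, delta=1.0)
    return mhi
```

In the map returned by `demo()`, the most recently visited column holds the highest energy and older positions hold progressively lower values, which is how the energy level encodes recency. The map stores no explicit motion vectors, consistent with the limitation noted above.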


This research evaluates global motion representations of video sequences based on optical flow and motion history methods. The two most important motion features in a scene, moving speed and moving direction, are extracted from each spatiotemporal representation and used to evaluate its performance. The speed feature is defined as the mean of the foreground magnitudes, and the direction feature is given by the entropy of the directional angles over all pixels in the scene image. The optical flow techniques evaluated are the Horn-Schunck (H-S) and Lucas-Kanade (L-K) differential methods. They allow direct extraction of speed and direction information for individual pixels, but cannot describe the complete cycle of an activity. The motion history techniques evaluated are the Motion History Image (MHI) and the exponential MHI. They do not give explicit speed and direction features, but they represent the whole cycle of an activity well. A hybrid spatiotemporal representation that incorporates the advantages of both optical flow and motion history is also proposed. The use of the motion representations and their extracted motion features for radical event detection and activity classification is demonstrated. Test video sequences with increasing movement speed and increasing complexity of moving directions, together with the public BEHAVE, Weizmann and KTH activity datasets, are used for evaluation. Experimental results show that the optical flow techniques describe speed and direction well over consecutive images in the video, while the motion history techniques better represent motion patterns and are well suited to activity recognition. Overall, the proposed hybrid representation gives the best performance.
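
As a rough illustration of the two features defined above, the following sketch computes a speed feature (mean magnitude) and a direction feature (entropy of the angle histogram) from a dense flow field. The foreground rule (magnitude above `min_magnitude`), the number of histogram bins `n_bins`, and the function names are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def speed_feature(u, v, min_magnitude=1e-6):
    """Mean flow magnitude over foreground pixels (assumed: magnitude > threshold)."""
    mag = np.hypot(u, v)
    fg = mag > min_magnitude
    if not fg.any():
        return 0.0
    return float(mag[fg].mean())

def direction_entropy(u, v, min_magnitude=1e-6, n_bins=8):
    """Shannon entropy (in bits) of the histogram of flow angles."""
    mag = np.hypot(u, v)
    fg = mag > min_magnitude
    if not fg.any():
        return 0.0
    angles = np.arctan2(v[fg], u[fg])  # angles lie in [-pi, pi]
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    p = hist[hist > 0] / hist.sum()    # drop empty bins before taking logs
    return float(np.sum(p * np.log2(1.0 / p)))

# Uniform motion to the right: one direction only, so entropy is zero.
u_same = np.ones((4, 4))
v_same = np.zeros((4, 4))

# Four pixels moving right, up, left and down: four equally likely directions.
u_mixed = np.array([1.0, 0.0, -1.0, 0.0])
v_mixed = np.array([0.0, 1.0, 0.0, -1.0])
```

With these inputs, the uniform field has zero direction entropy, while the mixed field spreads its angles evenly over four histogram bins and therefore has an entropy of 2 bits. Both fields have the same speed feature, which shows that the two features capture different aspects of the motion.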




