

Distributed On-line Object Tracking in a Video Sensor Network

Advisor: Shao-Yi Chien (簡韶逸)


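In a traditional video sensor network, each camera sends its footage back to a server before any analysis takes place. As the number of cameras in the environment grows, however, the available bandwidth is increasingly unable to carry this volume of traffic, and the power consumed by the cameras becomes considerable. Adopting distributed computation and returning only the important results to the server would be a major breakthrough. Beyond being distributed, we also want such a system to produce results in real time and to be highly portable: once it is set up in a new environment, it should start working immediately without manual labeling.

On the other hand, deciding whether people seen by different cameras are the same person remains a hard problem. The same person is captured from different angles by different cameras, is sometimes occluded by other people, and sometimes moves far away in the frame and appears small and blurry. These conditions greatly reduce matching accuracy and expose the limits of hand-crafted features, which motivates using a convolutional neural network so that the system learns for itself which features yield better results.

This master's thesis proposes a convolutional neural network based system that produces results in real time, uses the temporal and spatial cues of the video sensor network to adjust the matching results, and finally improves the way video features are extracted, so that the system's precision exceeds 80% while its recall exceeds 90%. In addition, each of the currently available video test datasets is in some way ill-suited to video sensor network applications, so this thesis also provides a new video test dataset and an improved evaluation method for assessing systems developed for this application.

At the end of the thesis, the matching results produced by the system are further visualized in two ways: one is the Marauder's Map from Harry Potter, and the other is a personalized video. By developing such a system, we hope to bring research on video sensor networks closer to real-world applications.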


In a traditional video sensor network, analysis starts only after the videos from every video sensor have been collected. However, as the number of video sensors in the environment grows, the limited bandwidth can hardly handle this kind of transmission, and power consumption becomes considerable. Consequently, distributed computation, an on-line algorithm, and a portable system that does not require manual labeling are needed. On the other hand, human tracking in a video sensor network remains a challenging problem owing to pose variation, low resolution, and occlusion. Neural network based systems are expected to address this problem through their strong learning capacity. In this thesis, a convolutional neural network based tracking system for video sensor networks is proposed and improved. Furthermore, a newly collected dataset and a novel benchmark are provided for evaluating video sensor network tracking systems. The proposed system achieves promising results and further visualizes them, showing that such a system could be realized in the real world.


