
針對智慧型辨識應用之大腦啟發新皮質運算演算法與架構設計

Brain-inspired Neocortical Computing Algorithm and Architecture Design for Intelligent Visual Recognition Applications

Advisor: 陳良基 (Liang-Gee Chen)

Abstract


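Driven by the steady advance of IC technology under Moore's law, many intelligent visual data analytics applications that once ran only on large mainframe computers have begun to enter everyday life. As we move toward future visual recognition applications, a wider variety of application requirements (for example, intelligent surveillance systems and driverless vehicles) will need to be addressed. Developing widely applicable, low-power, real-time intelligent visual recognition hardware is therefore an important and inevitable research trend, and reaching brain-like intelligence across these applications is undoubtedly one of its most important goals. In this dissertation, we first review basic visual neuroscience and the design concepts and theory behind a new generation of brain-mimicking recognition algorithms, which we call the Neocortical Computing (NC) model. Chapter 2 introduces HMAX, the basic NC algorithm that serves as our development framework. Chapter 3 discusses the shortcomings of the existing HMAX algorithm under future application trends: in more challenging scenarios closer to real-life needs, HMAX cannot effectively support two important directions, 1) action/activity video recognition and 2) large-scale image/video recognition and learning. To address the first problem, we propose an advanced NC algorithm based on a brain-inspired Reservoir Kernel, which raises the dimensionality of the feature vector while integrating the short-duration atomic-action information extracted by the HMAX network. On a recently proposed human action/activity video benchmark, this algorithm improves the recognition rate by more than 1.4x. To address the second problem, we propose a brain-inspired Feature Selective Hashing method for efficiently indexing and querying learned objects. Experiments show that it cuts recognition time complexity by up to 90% at a cost of only 1% in accuracy. Chapters 4 and 5 present the proposed NC processor architecture, which comprises a grey-matter-like homogeneous 36-core architecture featuring event-triggered hybrid MIMD execution and a white-matter-like Kautz network-on-chip (NoC) architecture featuring proactive fault/congestion avoidance and redundancy-free multicast. With these features, the architecture meets three design challenges, 1) architectural scalability, 2) GOPS-level computational complexity, and 3) Tb/s-level data bandwidth, and accelerates brain-mimicking NC algorithms very efficiently, achieving the goal of widely applicable, low-power, real-time visual recognition hardware. The design is implemented in a TSMC 65 nm process on a 4.5 × 4.5 mm² die, delivering 360 GOPS peak performance and 2.3 Tb/s aggregate NoC bandwidth, with an average power consumption of 205 mW at 250 MHz and 1.0 V. Its energy efficiency exceeds that of the best existing visual recognition processors, reaching 1.0 TOPS/W (overall computation) and 151 Tb/s/W (NoC transfer). The design supports a range of NC applications, including object/face/scene image recognition (at 128 × 128 or 256 × 256 resolution) and action/sport video recognition (at 128 × 128 resolution), at up to 130 frames per second. In summary, this dissertation presents our exploration and realization of brain-inspired neocortical computing algorithms and architecture; the resulting techniques can support the development of future intelligent visual recognition applications.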
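The abstract describes the Reservoir Kernel only by its two properties: it lifts the feature dimension and it carries temporal memory across short atomic actions. It gives no equations, so the sketch below is not the dissertation's method. It is a generic leaky echo-state reservoir, a standard reservoir-computing construction, used here purely to illustrate those two properties. The function names, the leak rate, the weight scaling, and the toy 4-dimensional frame features standing in for HMAX outputs are all assumptions.

```python
import math
import random

def make_reservoir(input_dim, reservoir_dim, seed=0, scale=0.5):
    """Draw fixed random input and recurrent weights (they are never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-scale, scale) for _ in range(input_dim)]
            for _ in range(reservoir_dim)]
    # Dividing by sqrt(reservoir_dim) keeps the recurrent gain small for
    # typical draws, so the influence of old inputs fades gradually.
    w_rec = [[rng.uniform(-scale, scale) / math.sqrt(reservoir_dim)
              for _ in range(reservoir_dim)]
             for _ in range(reservoir_dim)]
    return w_in, w_rec

def run_reservoir(frames, w_in, w_rec, leak=0.3):
    """Fold a sequence of per-frame feature vectors into one reservoir state."""
    state = [0.0] * len(w_rec)
    for frame in frames:
        new_state = []
        for i in range(len(state)):
            drive = sum(w * x for w, x in zip(w_in[i], frame))
            drive += sum(w * s for w, s in zip(w_rec[i], state))
            # Leaky update: blend the previous state with the new response.
            new_state.append((1 - leak) * state[i] + leak * math.tanh(drive))
        state = new_state
    return state

# Hypothetical 4-dimensional per-frame features (placeholders for HMAX outputs).
w_in, w_rec = make_reservoir(input_dim=4, reservoir_dim=16)
clip = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
forward = run_reservoir(clip, w_in, w_rec)
backward = run_reservoir(clip[::-1], w_in, w_rec)
print(len(forward))  # 16
```

The final state has 16 components for 4-dimensional inputs, which is the dimension-lifting part. Playing the same frames in reverse order yields a different state, which is the temporal-memory part: a classifier trained on such states can separate sequences that contain the same atomic actions in a different order.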

Parallel Abstract


Thanks to the ceaseless driving force of Moore's law, intelligent visual data analytics that once required gigantic mainframe computers has started to enter our daily lives. As we move toward future visual recognition applications, in which many more possibilities (e.g., intelligent surveillance and driverless cars) can emerge, developing widely applicable, low-power, real-time intelligent visual recognition hardware is an inevitable research trend, and among all research goals, attaining human brain-like performance is undoubtedly the ultimate one. In this dissertation, we first review basic visual neuroscience and the fundamental design concepts and theories behind the rising brain-mimicking recognition algorithms, which we call the Neocortical Computing (NC) model. In Chapter 2, we introduce the basic NC algorithm, HMAX, as our starting framework; it has demonstrated promising performance on image recognition and basic video recognition applications. In Chapter 3, we discuss the deficiencies of the basic HMAX in future applications, where it must be extended to more difficult and closer-to-real-life recognition tasks such as 1) action/activity video recognition and 2) large-scale image/video recognition and learning. To address the first issue, we propose an advanced NC algorithm that combines HMAX with a brain-inspired Reservoir Kernel, which functions as a dimension-lifting kernel with temporal memory that integrates the shorter temporal information (atomic actions) extracted by the HMAX network. Experimental results show an increase in recognition accuracy of over 1.4x on the latest human action/activity dataset. To address the second issue, we propose a brain-inspired Feature-Selective Hashing scheme for indexing and searching object instances efficiently.
Experimental results show that it reduces recognition time by up to 90% with less than 1% accuracy drop, and that it also provides computational scalability as the number of learned object instances increases. In Chapters 4 and 5, we introduce the proposed NC processor architecture, including the grey matter-like homogeneous 36-core architecture with event-driven hybrid MIMD execution and the white matter-like Kautz NoC architecture with fault/congestion avoidance and redundancy-free multicast. With these design features, the proposed architecture meets the design challenges of 1) scalability, 2) GOPS-level computational complexity, and 3) Tb/s-level communication bandwidth, and efficiently accelerates the brain-mimicking NC algorithms; the goal of widely applicable, power-efficient, real-time visual recognition is thus reached. The processor is implemented in TSMC 65 nm technology on a 4.5 × 4.5 mm² die with 360 GOPS peak performance, 2.3 Tb/s aggregate NoC bandwidth, and 205 mW average power consumption when running at 250 MHz and 1.0 V. It achieves 1.0 TOPS/W overall power efficiency and 151 Tb/s/W NoC power efficiency, both higher than those of state-of-the-art visual recognition processors. NC applications, including object/face/scene image recognition (128 × 128 or 256 × 256) and action/sport video recognition (128 × 128), can be executed at up to 130 fps. To sum up, this dissertation presents our exploration and realization of the brain-inspired Neocortical Computing algorithm and architecture, which can serve a wide range of intelligent visual recognition applications.
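To make the Kautz NoC topology more concrete, the sketch below builds a Kautz graph and measures its diameter. The abstract does not state which Kautz parameters the chip uses. The choice of degree 3 and walk length 2 is an assumption, picked only because that graph has exactly 36 nodes and so matches the 36-core count. The function names are illustrative, and the sketch models topology only, not the fault/congestion avoidance or multicast logic.

```python
from itertools import product

def kautz_graph(degree, length):
    """Return the adjacency dict of the Kautz graph K(degree, length).

    Nodes are tuples of `length + 1` symbols drawn from an alphabet of
    `degree + 1` symbols, with no two adjacent symbols equal. Node
    (s1, s2, ..., sk) links to (s2, ..., sk, x) for every x != sk.
    """
    alphabet = range(degree + 1)
    nodes = [w for w in product(alphabet, repeat=length + 1)
             if all(a != b for a, b in zip(w, w[1:]))]
    return {w: [w[1:] + (x,) for x in alphabet if x != w[-1]] for w in nodes}

def diameter(graph):
    """Longest shortest path in hops, found by breadth-first search from every node."""
    worst = 0
    for source in graph:
        dist = {source: 0}
        frontier = [source]
        while frontier:
            next_frontier = []
            for u in frontier:
                for v in graph[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        next_frontier.append(v)
            frontier = next_frontier
        worst = max(worst, max(dist.values()))
    return worst

noc = kautz_graph(degree=3, length=2)   # assumed parameters, see lead-in
print(len(noc))        # 36
print(diameter(noc))   # 3
```

Under these assumed parameters, every node has three outgoing links and any node can reach any other in at most three hops. That combination of small fixed degree and short diameter is the usual reason for choosing a Kautz topology in a many-core interconnect.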

