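As technology continues to advance, computers have ever more computing power, which has given rise to many intelligent applications such as smile shutters, automatic surveillance systems, smart cars, and smart homes. These smart machines can sense their surroundings, for example by detecting people, and help humans by providing safe, convenient, and efficient control. In this thesis, these applications are called human-centric intelligent applications, that is, applications that originate from human needs. We focus on human-centric recognition applications such as face recognition, object recognition, and action recognition. On the other hand, because we live in an era dominated by radio-equipped computers, the volume of multimedia data is growing very fast. In 2010, more than 35 hours of video were being uploaded to the video-sharing site YouTube every minute. At this rate, we need to process more than one zettabyte of information per year. Therefore, to support various intelligent applications and manage this huge volume of data, we need an efficient and scalable hardware platform that provides the required computing power. The ultimate goal is to approach human-like intelligence. For building intelligent recognition systems, mimicking the structure and function of the visual cortex has always been a major approach to realizing a human-like intelligent vision system. In this thesis, we begin by exploring the brain's computing style and architecture, and then design a brain-mimicking computing system for visual recognition whose resources can easily be scaled to meet the needs of future intelligent applications. The overall system design flow proceeds from Neocortical Computing (NC) model design, to NC system design, and then to a human-centric NC architecture based on an FPGA system. The NC model provides the functionality required by human-centric intelligent applications. The NC hardware architecture is an efficient and scalable hardware platform optimized for the NC model. The NC system is verified on an FPGA system. In this thesis, the main system design strategy is to provide application diversity together with the efficiency of the human brain.

First, we analyze current NC models and find that they lack temporal-domain integration, which makes it difficult to extend object recognition to time-dependent action recognition. To solve this problem, inspired by the information-transmission properties of the human brain and by neural network research, we propose a recurrent computing kernel that efficiently integrates feature information in the temporal domain. We can therefore construct an efficient dimension-reducing Reservoir Kernel with temporal memory, which integrates temporal information and supplies it to the HMAX recognition system, improving its recognition performance (especially for action recognition). Experimental results show that it substantially outperforms the state-of-the-art HMMSVM method.

Second, we analyze the NC system in order to realize the NC hardware architecture and identify its main problem: massive data access, which leads to poor power efficiency, redundant external bandwidth usage, slow speed, and no communication scalability. On current computing systems, this problem makes the NC system a memory-bound system. To solve it, inspired by the way neurons pass information, we propose a Push-based Dataflow (Push-DF) structure that reduces requests to external memory. Experimental results show that, compared with RISC and GPU, Push-DF in a many-core architecture achieves lower latency, power consumption, and external bandwidth. Push-based processing greatly reduces the number of external memory requests, so our NC system can break the bottleneck of traditional memory-bound systems. This important feature provides communication scalability, so our NC system meets the design goal of a scalable brain-mimicking hardware platform. Finally, we use the Push-DF structure to design the NC system and implement an 8-core NCSoC on an FPGA system. The resulting NCSoC recognition system takes 0.179 seconds to recognize a 100x100 image. In summary, NCSoC supports the NC model for a wide variety of intelligent recognition tasks and provides better performance, efficiency, and scalability than current computing platforms. It therefore has the potential to support a wide variety of human-centric intelligent applications and to manage large volumes of multimedia data in future applications.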
As technology continues to evolve, computers have more and more computing capacity, which has driven the emergence of many intelligent applications such as smile shutters, automatic surveillance systems, smart cars, and smart homes. These smart machines can sense their surroundings as humans do and help people by providing safety, convenience, and efficiency. In this thesis, such applications are called human-centric applications because they are based on human needs. We focus on human-centric recognition applications such as face recognition, object recognition, and action recognition. On the other hand, since we are in an era dominated by radio-equipped computers, the amount of multimedia data is growing extremely fast. YouTube reported that in 2010 more than 35 hours of video were being uploaded to the video-sharing site every minute. At this rate, we need to handle over one zettabyte of information annually. Therefore, to support various intelligent applications and manage this huge amount of data, we need an efficient and scalable hardware platform that provides the required computation capability. The ultimate goal is to approach human-like intelligence. In building intelligent machines, mimicking the structures and functions of the visual cortex has always been a major approach to implementing a human-like intelligent visual system. In this thesis, we start by exploring the brain's computing style and architecture, and then design a brain-like computing system for visual recognition that scales easily with the amount of available resources for future intelligent applications. The whole system design flow starts from Neocortical Computing (NC) model design, proceeds to NC system design, and ends with a real-time human-centric NC architecture based on an FPGA system. The NC model provides the functionality required by intelligent human-centric applications. The NC architecture is an efficient and scalable hardware platform optimized for the NC model.
The FPGA system verifies the NC system by transforming the NC model into specific memory content that the platform can interpret. In this thesis, the main system design strategy is to provide the application diversity and efficiency of the human brain.

First, we analyze current NC models and find that they lack temporal-domain integration, which makes it hard to extend object recognition to time-relevant action recognition. To solve this problem, inspired by the recurrent nature of information transmission in the human brain and by neural network research, we propose a recurrent computing kernel that integrates temporal-domain action feature information efficiently. With it we construct an efficient dimension-lifting Reservoir Kernel that exhibits temporal memory and can therefore integrate the temporal information provided by the HMAX network and boost its recognition performance. Experimental results show that it substantially outperforms the state-of-the-art HMMSVM method.

Second, for the NC system design, we analyze the computation of the NC model and identify its main problem: massive data access, which results in power inefficiency, redundant external bandwidth usage, slow response, and no communication scalability. On current computing systems, this problem makes the NC system memory-bound. To address this issue, inspired by the information forwarding scheme of neurons, we propose a Push-based Dataflow (Push-DF) structure that uses push-based processing to reduce external memory access and to forward sparse data efficiently. Experimental results show that Push-DF in a many-core architecture achieves lower latency, power consumption, and external bandwidth than RISC and GPU. Push-based processing greatly reduces external memory access, so our NC system can break the bottleneck of traditional memory-bound systems.
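The difference between pull-style and push-style processing can be illustrated with a toy software analogy. The sketch below is not the Push-DF hardware design; it assumes a single fully connected layer, assumes that weights are held locally in both cases, and only counts how often an activation is fetched from a shared ("external") store. The function names `pull_layer` and `push_layer` are hypothetical.

```python
def pull_layer(weights, activations):
    """Pull style: every output unit fetches every input activation itself."""
    reads = 0
    outputs = []
    for row in weights:
        acc = 0.0
        for j, w in enumerate(row):
            acc += w * activations[j]  # one shared-memory fetch per term
            reads += 1
        outputs.append(acc)
    return outputs, reads


def push_layer(weights, activations):
    """Push style: each activation is fetched once and forwarded to all consumers."""
    reads = 0
    accumulators = [0.0] * len(weights)  # assumed local to each consumer
    for j, a in enumerate(activations):
        reads += 1  # single fetch of this activation
        if a == 0.0:
            continue  # sparse forwarding: inactive units send nothing
        for i, row in enumerate(weights):
            accumulators[i] += row[j] * a
    return accumulators, reads


weights = [[1.0, 2.0, 3.0, 4.0],
           [0.0, 1.0, 0.0, 1.0],
           [2.0, 0.0, 2.0, 0.0]]
activations = [0.0, 2.0, 0.0, 1.0]

pull_out, pull_reads = pull_layer(weights, activations)
push_out, push_reads = push_layer(weights, activations)
print(pull_out, pull_reads)  # [8.0, 3.0, 0.0] 12
print(push_out, push_reads)  # [8.0, 3.0, 0.0] 4
```

Both functions compute the same outputs, but in this toy setting the pull version fetches each activation once per consumer, while the push version fetches it once in total and skips zero activations entirely.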
The reduced external memory access gives our NC system communication scalability, which meets the design goal of a scalable brain-mimicking hardware platform.

Finally, we use the proposed Push-DF structure to design the NC system and implement an 8-core NCSoC on an FPGA system. Our final NCSoC implementation takes 0.179 seconds to recognize a 100×100 image. In conclusion, NCSoC supports the NC model for various intelligent recognition tasks and provides better performance, efficiency, and scalability than current computing platforms. As a result, it has the potential to support various intelligent applications and to manage huge amounts of multimedia data in future applications.
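As an illustration of how a recurrent reservoir can carry temporal memory across a sequence of per-frame features, here is a minimal echo-state-style sketch. It is not the thesis's Reservoir Kernel: the random weights, the `tanh` update, the row-sum scaling, and the helper names `make_reservoir`, `step`, and `run_reservoir` are all assumptions made for this example, and the toy two-dimensional frames stand in for features that would come from an HMAX-like front end. The reservoir is made larger than the input purely for illustration.

```python
import math
import random


def make_reservoir(n_in, n_res, scale=0.9, seed=0):
    """Build fixed random input and recurrent weights (hypothetical helper)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_res)]
    w_rec = [[rng.uniform(-1.0, 1.0) for _ in range(n_res)] for _ in range(n_res)]
    # Rescale each row so its absolute sum equals `scale` (< 1). This bounds the
    # infinity norm of w_rec, so the influence of old states fades over time.
    for row in w_rec:
        total = sum(abs(v) for v in row)
        for j in range(n_res):
            row[j] *= scale / total
    return w_in, w_rec


def step(w_in, w_rec, state, frame):
    """One recurrent update: mix the current frame with the previous state."""
    new_state = []
    for i in range(len(state)):
        drive = sum(w * u for w, u in zip(w_in[i], frame))
        memory = sum(w * s for w, s in zip(w_rec[i], state))
        new_state.append(math.tanh(drive + memory))
    return new_state


def run_reservoir(w_in, w_rec, frames):
    """Feed a sequence of feature vectors and return the final reservoir state."""
    state = [0.0] * len(w_rec)
    for frame in frames:
        state = step(w_in, w_rec, state, frame)
    return state


w_in, w_rec = make_reservoir(n_in=2, n_res=6)
forward = run_reservoir(w_in, w_rec, [[1.0, 0.0], [0.0, 1.0]])
backward = run_reservoir(w_in, w_rec, [[0.0, 1.0], [1.0, 0.0]])
print(forward)
print(backward)
```

The two sequences contain the same frames in opposite order, yet the final states differ because each state depends on the history of inputs. A classifier trained on such states can therefore distinguish actions by their temporal order, which a purely per-frame feature cannot do.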