
多媒體內容分析系統之演算法與積體電路架構設計

Algorithm and VLSI Architecture Design of Multimedia Content Analysis System

Advisor: Shao-Yi Chien (簡韶逸)

Abstract


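In recent years, advances in semiconductor process technology have brought more and more applications into consumer electronics products. Products with different functions, such as mobile phones, digital cameras, and handheld computers, are gradually being integrated into single, complete systems that combine all of these functions. Meanwhile, memory capacity keeps rising while its price and production cost keep falling. It is foreseeable that the traditional hard disk may well be replaced by advanced memory devices with larger capacity. Once very large storage space is available, storing multimedia information becomes a major use, which in turn makes automated multimedia analysis an important application. In embedded systems for consumer electronics, neither the conventional CPU nor the custom digital integrated circuit can meet both the flexibility and the performance requirements of multimedia analysis algorithms. Developing a new software/hardware design approach is therefore important for next-generation applications.

To address the needs of multimedia content analysis, the author proposes a new design and implementation approach. It spans algorithm design, hardware architecture analysis, software/hardware co-design, and System-on-a-Chip implementation, and the resulting series of methods is suited to a range of consumer electronics products, such as mobile devices. To analyze multimedia content effectively, feature extraction and machine learning are indispensable steps. Many machine learning algorithms are in use today, including supervised and unsupervised learning, and they are regarded as essential building blocks of multimedia content analysis. The author therefore proposes both high-performance custom hardware architectures and reconfigurable multi-function hardware architectures to support machine learning and to carry out multimedia content analysis. For the high-performance custom architectures, the author analyzes K-Means clustering in depth because it is a very important unsupervised learning algorithm. The dissertation proposes four different K-Means clustering architectures, each suited to a different application environment, and demonstrates a software/hardware co-design system that incorporates the proposed hardware to perform automated photo retrieval. For the reconfigurable multi-function architectures, two System-on-a-Chip designs with different characteristics are proposed to support different machine learning algorithms. Both systems use a massively parallel stream processor architecture to handle the large volume of image feature extraction computation. Building on reconfigurable hardware and high-bandwidth memory units, they also support different machine learning algorithms, including K-Means clustering, the K-Nearest Neighbor classifier, the Gaussian model classifier, the Support Vector Machine, and neural networks.

In short, this dissertation proposes two segmentation algorithms for video and images, four different K-Means clustering hardware architectures, one software/hardware co-designed photo retrieval system, and two high-performance System-on-a-Chip designs that support feature extraction and machine learning algorithms, which together form a series of solutions for the computation in multimedia analysis systems.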

Parallel Abstract


Thanks to the development of semiconductor technology, Consumer Electronics (CE) products now support an increasingly versatile range of applications. Different kinds of CE products, such as cellular phones, digital still cameras, and portable computers, are gradually being integrated into a single system. In the near future, one CE product might combine several functions, including making phone calls, sending e-mail, and taking and storing photos. At the same time, memory technology is advancing rapidly: the capacity of flash memory is increasing while its price is decreasing steadily, so new memory technology might replace the traditional hard disk for data storage. Because of the development of the Internet and the large amount of stored data, managing multimedia content has become an important and indispensable task. The integration of different functions and the growth of multimedia data therefore make automatic multimedia content analysis necessary for CE products. In embedded systems for CE products, the architectures of the traditional CPU/RISC and the ASIC cannot satisfy both the flexibility and the performance requirements of multimedia applications, so new design methodologies and solutions need to be explored for next-generation applications.

In this dissertation, new implementation methods and frameworks for multimedia content analysis are proposed. A systematic approach is adopted that covers algorithm design, architectural analysis, hardware architecture design, software/hardware co-design, and SoC design. The proposed methods provide a series of new solutions for next-generation consumer electronics applications (e.g., mobile devices). To analyze multimedia content effectively, feature extraction and machine learning algorithms are both indispensable. Many machine learning algorithms are widely employed in different applications, and they can be regarded as essential building blocks for multimedia content analysis. To handle both supervised and unsupervised learning algorithms, high-performance hardware architectures and reconfigurable hardware architectures are proposed. For the high-performance architectures, this dissertation focuses on the K-Means clustering algorithm because of its popularity and importance, and its applications are also demonstrated. A total of four kinds of K-Means architectures are developed. For the reconfigurable hardware architectures, two System-on-a-Chip (SoC) architectures with different features are proposed. These systems can process a large amount of data in parallel and perform feature extraction with high bandwidth, and they can also run various machine learning algorithms, such as K-Means clustering, K-Nearest Neighbor classification, Gaussian Mixture Model-based classification, Support Vector Machine, and Artificial Neural Network.

In short, the contributions of this dissertation are two algorithms for video and image segmentation, one software/hardware co-design platform, four different architectures for K-Means clustering, and two SoCs for multimedia content analysis. Together they can be regarded as a series of new solutions to multimedia content analysis for CE products.
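For readers unfamiliar with the K-Means clustering algorithm on which the abstract focuses, the following is a minimal software sketch of the standard iterative (Lloyd's) procedure. It is illustrative only and does not represent any of the four hardware architectures proposed in the dissertation; the function name `kmeans`, its parameters, and the fixed iteration count are assumptions made for this example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's K-Means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; not the dissertation's design.
    """
    rng = random.Random(seed)
    # Initialise centroids by picking k distinct input points.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)

    def sq_dist(a, b):
        # Squared Euclidean distance between two points.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(pt, centroids[c]))
            for pt in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

The assignment step is a large batch of independent distance computations, which is the kind of workload that lends itself to parallel hardware; how the dissertation's architectures organize that computation is not described in the abstract.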

