
A Study of Automatic Image Annotation Using Self-Organizing Maps

(Original title: 運用自我組織圖技術於自動化影像註解之研究)

Advisor: 楊新章

Abstract

As multimedia devices become widespread and image databases grow ever larger, the information people look for has gradually shifted from text to images. Most common image retrieval methods rely on low-level image features such as color, shape, and texture, but this approach is prone to the semantic gap problem. Keyword-based image retrieval achieves high precision, but it requires every image to be annotated by hand, which costs a great deal of labor and time and does not scale to large databases.

This thesis proposes an automatic annotation method that uses training data and processes images and their text annotations separately. On the image side, each image is segmented into regions and color features are extracted to produce a set of image vectors. On the text side, preprocessing likewise produces a set of text vectors. Both sets of vectors are clustered with self-organizing maps (SOM), yielding an image cluster map and a text cluster map, which are then used to attach annotations automatically to newly added images. In the experimental evaluation, we use overlap rate, recall, and precision to judge the quality of the system's results.
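The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how the image map and the text map are linked, so the sketch assumes a single SOM over color vectors whose units are labelled with the keywords of the training images mapped to them. All function names, parameters (map size, learning rate, neighbourhood width), and the toy data are hypothetical. The overlap rate is not defined in the abstract and is left out; precision and recall follow their standard definitions.

```python
import math
import random
from collections import Counter

def best_matching_unit(weights, x):
    """Return (row, col) of the map unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with linearly decaying rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3  # keep the radius positive
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood around the winning unit.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights

def label_units(weights, data, keywords):
    """Assumed linking step: count the keywords of training images per unit."""
    labels = {}
    for x, words in zip(data, keywords):
        unit = best_matching_unit(weights, x)
        labels.setdefault(unit, Counter()).update(words)
    return labels

def annotate(weights, labels, x, top_k=2):
    """Return the most frequent keywords of the labelled unit closest to x."""
    def dist(unit):
        r, c = unit
        return sum((wi - xi) ** 2 for wi, xi in zip(weights[r][c], x))
    unit = min(labels, key=dist)
    return [word for word, _ in labels[unit].most_common(top_k)]

def precision_recall(predicted, truth):
    """Precision and recall of predicted keywords against the true keywords."""
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

# Toy usage: reddish and bluish color vectors with made-up keywords.
train_x = [[0.9, 0.1, 0.1 + 0.01 * i] for i in range(10)] + \
          [[0.1, 0.1 + 0.01 * i, 0.9] for i in range(10)]
train_kw = [["sunset", "sky"]] * 10 + [["sea", "sky"]] * 10
som = train_som(train_x)
unit_labels = label_units(som, train_x, train_kw)
print(annotate(som, unit_labels, [0.88, 0.12, 0.12]))
```

In a fuller version, the color vectors would come from segmented image regions rather than whole images, and a second map over the text vectors would replace the simple per-unit keyword counts used here.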
