In recent years, with the spread of the Internet and the growth of data storage capacity, information retrieval has become one of the most widely applied and discussed techniques for handling large volumes of data effectively. However, most of these large image collections are stored in a disorganized way, which makes image retrieval more difficult. Previous research has focused mainly on extracting and classifying low-level image features, and cannot effectively map semantics onto those features. Image search websites index each image by the text of the web page that contains it, but they do not handle synonyms of the query terms when matching.

This study proposes a mechanism that combines the semantic features of the web page text surrounding an image with the low-level features of the image itself, with the aim of improving the precision of image search websites. For the text of the web page containing an image, we apply the vector space model to extract important keywords, use the online dictionary WordNet to handle synonyms and concept expansion, and use automatically extracted image color features to improve the precision of web image search.

In the implementation, we compare image search websites under three different dataset configurations and examine the precision and recall of image search for each. Analysis of the test data shows that semantic features combined with low-level image feature extraction clearly improve the precision of image search.
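The pipeline summarized above can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions: TF-IDF is used as one common weighting scheme for the vector space model, a small hand-written synonym table (`TOY_SYNONYM_GROUPS`) stands in for WordNet lookups, and a quantized RGB histogram compared by histogram intersection stands in for the color feature. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-in for WordNet synonym sets; a real system would query WordNet.
TOY_SYNONYM_GROUPS = [{"car", "automobile", "auto"}, {"picture", "image", "photo"}]


def tfidf_keywords(docs, top_k=3):
    """Return the top_k TF-IDF weighted terms of each tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        weights = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        # Sort by descending weight, breaking ties alphabetically.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        keywords.append(ranked[:top_k])
    return keywords


def expand_query(terms, groups=TOY_SYNONYM_GROUPS):
    """Add every synonym that shares a group with a query term."""
    expanded = set(terms)
    for group in groups:
        if expanded & group:
            expanded |= group
    return expanded


def color_histogram(pixels, bins=4):
    """Quantize (r, g, b) pixels (0-255 per channel) into a normalized histogram."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved result set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In a sketch like this, the expanded query terms would be matched against each page's extracted keywords, and the color similarity could then be used to re-rank the text matches; how the two scores are actually combined is not stated in the abstract.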