An Image Search System Based on Search Engine Technology

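Image search has been studied for a long time. One problem that remains unsolved is that, when a database holds a very large number of images, processing and searching take a great deal of time because each image carries so much information. Information retrieval (IR), the core technology behind today's web search engines, has developed over several decades; it offers mature, widely used search methods, including well-regarded approaches for searching large volumes of data. To address the long search times in large image databases, we propose a simple solution: map image search onto text search so that existing IR techniques can be applied. We extract "visual words" from images to correspond to the "words" in an article. Our method has three main parts: (1) Building visual words. We extract from each image the "visual words" that play the role of "words"; they are obtained from equally divided image blocks, and two methods based on different viewpoints are used to produce them. (2) Building the index. We build an index that records the relationship between images and visual words, including the weight of each visual word in each image. The focus of this part is the inverted index method from text search, which reduces the time and computing resources needed at search time, together with the TF-IDF method, which gives the weight of each visual word in an image. (3) Image search. We use the index to search for similar images; two methods, (a) Count Match and (b) Vector Model, compute the similarity between the query image and the database images, which are then ranked by that similarity. Finally, we evaluate the efficiency and effectiveness of these methods on an image database collected from auction web pages.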
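The index and search steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (`build_index`, `count_match`, `vector_model`) are hypothetical, visual words are taken to be plain hashable ids, the IDF is smoothed as log(1 + N/df), Count Match is read as "number of distinct query words an image shares", and Vector Model is read as cosine similarity over TF-IDF weights.

```python
import math
from collections import Counter, defaultdict


def build_index(images):
    """Build an inverted index from {image_id: [visual word, ...]}.

    Returns (postings, idf, norms), where postings maps each visual word
    to {image_id: TF-IDF weight}. The smoothed IDF is an assumption.
    """
    n_images = len(images)
    doc_freq = Counter()
    for words in images.values():
        doc_freq.update(set(words))
    idf = {w: math.log(1 + n_images / df) for w, df in doc_freq.items()}

    postings = defaultdict(dict)
    norms = {}
    for image_id, words in images.items():
        squared = 0.0
        for word, count in Counter(words).items():
            weight = (count / len(words)) * idf[word]
            postings[word][image_id] = weight
            squared += weight * weight
        norms[image_id] = math.sqrt(squared)
    return postings, idf, norms


def count_match(query_words, postings):
    """Rank images by how many distinct query words they contain."""
    scores = Counter()
    for word in set(query_words):
        for image_id in postings.get(word, {}):
            scores[image_id] += 1
    return scores.most_common()


def vector_model(query_words, postings, idf, norms):
    """Rank images by cosine similarity of TF-IDF vectors."""
    counts = Counter(query_words)
    query = {w: (c / len(query_words)) * idf[w]
             for w, c in counts.items() if w in idf}
    query_norm = math.sqrt(sum(v * v for v in query.values()))
    dots = defaultdict(float)
    # Only images that share a word with the query are ever touched.
    for word, q_weight in query.items():
        for image_id, d_weight in postings[word].items():
            dots[image_id] += q_weight * d_weight
    scores = {i: d / (query_norm * norms[i]) for i, d in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy database: string labels stand in for visual-word ids.
    db = {"a": ["red", "red", "blue"], "b": ["red", "green"],
          "c": ["blue", "blue", "blue"]}
    postings, idf, norms = build_index(db)
    print(count_match(["red", "red", "blue"], postings))
    print(vector_model(["red", "red", "blue"], postings, idf, norms))
```

Both search functions walk only the posting lists of the query's own visual words, which is where the inverted index saves time compared with scoring every image in the database.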

Keywords

Image Search; Content-Based Image Retrieval; Information Retrieval; Database; Color Histogram

Parallel Abstract

Content-based image retrieval has been an important research topic for a long time. However, reducing the search time in a large image database remains a challenging problem. Information retrieval (IR), the core of text search engine technology, offers well-known and efficient methods for searching a large database. Our solution therefore extracts "visual words" from images, analogous to the "words" in articles, so that IR methods can be applied directly. Our method is divided into the following three parts: (1) Extract visual words. We recursively divide an image into four equal-sized blocks and propose two methods for extracting visual words from these blocks. (2) Build the index. We create an index between visual words and images and store the associated TF-IDF weights in the database. The key technique in this part is the inverted index, which reduces the time and computing resources needed to look up words. (3) Search images. We search the index for images similar to the query image. Two methods, (a) Count Match and (b) Vector Model, are proposed to estimate the similarity between the query image and the images in the database. We evaluated the proposed methods on image databases crawled from auction web pages.
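To make the block-splitting step concrete, here is a minimal sketch of recursive four-way division. The abstract does not specify the two extraction methods, so the word definition used here (the quantized mean color of each leaf block) is purely an illustrative assumption, as are the function names, the `depth` and `levels` parameters, and the choice to emit words only at leaf blocks.

```python
def mean_color(pixels, top, left, height, width):
    """Average (r, g, b) over one rectangular block of the image."""
    sums = [0, 0, 0]
    for y in range(top, top + height):
        for x in range(left, left + width):
            for ch in range(3):
                sums[ch] += pixels[y][x][ch]
    area = height * width
    return tuple(s / area for s in sums)


def quantize(color, levels=4):
    """Pack a color into one integer id using `levels` bins per channel."""
    word = 0
    for value in color:
        bin_index = min(int(value * levels / 256), levels - 1)
        word = word * levels + bin_index
    return word


def visual_words(pixels, depth=2, levels=4):
    """Split the image into four blocks recursively, `depth` times.

    pixels is a list of rows of (r, g, b) tuples with values 0..255.
    Returns one hypothetical visual word per leaf block, in
    top-left, top-right, bottom-left, bottom-right order.
    """
    words = []

    def split(top, left, height, width, level):
        # Stop at the requested depth or when a block cannot be halved.
        if level == depth or height < 2 or width < 2:
            color = mean_color(pixels, top, left, height, width)
            words.append(quantize(color, levels))
            return
        half_h, half_w = height // 2, width // 2
        for dy, block_h in ((0, half_h), (half_h, height - half_h)):
            for dx, block_w in ((0, half_w), (half_w, width - half_w)):
                split(top + dy, left + dx, block_h, block_w, level + 1)

    split(0, 0, len(pixels), len(pixels[0]), 0)
    return words


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    # 4x4 toy image: left half red, right half blue.
    image = [[red, red, blue, blue] for _ in range(4)]
    print(visual_words(image, depth=1))
```

Each image then becomes a short list of integer ids, which can be treated like the words of a document when an inverted index is built.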

Parallel Keywords

Image Search; Content-Based Image Retrieval; Information Retrieval; Database; Color Histogram
