基於圖片內容擷取之圖像語意擴充與其影片廣告安插之應用

Semantic Query Expansion for Content-based Image Retrieval and Its Application on Video Advertising

Advisor: 徐宏民

Abstract


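Content-based image retrieval (CBIR) is a key technology for managing an exponentially growing number of images, and it supports applications such as annotation by search, computational photography, and photo-based question answering. Despite decades of research, existing methods remain limited by the semantic gap. In this thesis, we propose a technique that exploits booming media-sharing services (e.g., Flickr) and search engines (e.g., Google) to obtain external knowledge (such as photos, photo tags, blogs, and auxiliary data), uses it to establish semantic relationships between images, and thereby improves CBIR performance. We also present a system that shows how semantic expansion can be put to use: it combines image matching (for images such as logos and landmarks) to insert advertisements into videos automatically. This ad recommendation system is a forward-looking service with the potential to increase online commercial revenue.

Because traditional CBIR techniques are constrained by the semantic gap, we propose a technique that combines the visual and textual cues of images. However, the data provided by social media, including photo tags, carry a great deal of noise. We therefore use knowledge from Google to compute the semantic relatedness between photo tags. Using a graph-based method, we build one model for each cue, and the three models built from the three cues are merged into a single model by linear combination. We then solve the problem with a random walk: the results of any CBIR method serve as the initial values, and the random walk refines them. Because each model is constructed independently, their weights in the linear combination can easily be changed to suit different needs. We also consider efficiency when processing large amounts of data. The same framework can be extended to keyword-based image retrieval and image annotation.

To demonstrate the use of semantic expansion, we also propose an ad recommendation system, AdVis, which automatically inserts advertisements into videos based on image matching. Here, bidders bid on images of interest (AdImages), similar to the AdWords system. AdVis focuses on optimizing system revenue while taking viewers' experience into account. We formulate this as a 0-1 integer programming problem.

Experiments on a Flickr dataset show that semantic expansion is effective for: (1) content-based image retrieval, where retrieval performance improves by 200% over previous methods; (2) text-based image retrieval, where results are markedly better than conventional text search, which is limited by noisy and incomplete photo tags; and (3) automatic photo tagging, where there is also a substantial improvement over representative prior methods.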

Parallel Abstract


Content-based image retrieval (CBIR) is an essential technique for managing exponentially growing photo collections and an enabling technology for applications such as annotation by search, computational photography, and photo-based question answering. Despite decades of research, current solutions remain limited by the "semantic gap." In this work, we propose to improve CBIR by automatically exploiting auxiliary knowledge (e.g., tags, photos, blogs, and metadata) from booming media-sharing services (e.g., Flickr) and search engines (e.g., Google); that is, by finding more semantically related images to enhance CBIR results. To demonstrate the benefits of semantic expansion, we further apply the proposed framework to a promising application: video advertising by target image matching, which automatically associates relevant ads through content-based matching over related image objects (e.g., logos and scenes). Given the prevalence of shared videos, it is a potential application for Internet monetization.

Since traditional CBIR methods are hindered by the semantic gap, our approach leverages the content and context information widely available in media-sharing services. However, such community-contributed cues (e.g., tags, descriptions, image appearance, and metadata) are generally noisy. We measure the semantic similarity between tags by exploiting Google knowledge. We apply a graph-based approach, building one graph model for each cue. After aggregating the multiple cues (as graphs) linearly, we perform a random walk on the unified graph, starting from the initial CBIR search results, to further improve precision and recall. Since each graph is built independently, the graph weights can be adapted to different applications. We also consider the efficiency issues that arise when deploying on large-scale media-sharing sites. The framework is generic and can be extended to other applications such as keyword-based image retrieval and image annotation.

To demonstrate the benefits of semantic expansion, we also propose a framework called AdVis, which automatically associates relevant ads by visual matching. Here, bidders bid on images of interest (adImages), which are analogous to keywords in the AdWords model. AdVis aims to maximize system revenue while accounting for viewers' perception. We formulate the solution as a nonlinear 0-1 integer programming problem.

In experiments on Flickr photo benchmarks, the proposed semantic expansion framework performs well in several respects: (1) example-based image retrieval: it outperforms traditional CBIR systems by up to 200%; (2) text-based image retrieval: it shows clear performance gains over conventional keyword-based search, since the latter suffers from noisy and missing tags; (3) image auto-annotation: it shows significant gains over other search-based image annotation approaches.
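The graph fusion and random walk described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the restart weight `alpha`, the equal cue weights, and the toy affinity matrices are all assumptions made for illustration, and the walk shown is a generic random walk with restart. The abstract combines three cues; the toy example uses two (visual and tag similarity) to stay short.

```python
from typing import List

Matrix = List[List[float]]


def fuse_graphs(graphs: List[Matrix], weights: List[float]) -> Matrix:
    """Linearly combine per-cue affinity matrices into one unified graph."""
    n = len(graphs[0])
    fused = [[0.0] * n for _ in range(n)]
    for graph, weight in zip(graphs, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += weight * graph[i][j]
    return fused


def row_normalize(matrix: Matrix) -> Matrix:
    """Convert affinities to transition probabilities (each row sums to 1)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # A node without edges keeps a zero row; its mass is dropped in this sketch.
        normalized.append([v / total for v in row] if total > 0 else [0.0] * len(row))
    return normalized


def random_walk(transition: Matrix, initial: List[float],
                alpha: float = 0.85, iters: int = 200, tol: float = 1e-10) -> List[float]:
    """Random walk with restart; `initial` holds the scores from any CBIR method."""
    total = sum(initial)
    prior = [v / total for v in initial]  # restart distribution
    scores = prior[:]
    n = len(prior)
    for _ in range(iters):
        updated = [
            alpha * sum(scores[i] * transition[i][j] for i in range(n))
            + (1.0 - alpha) * prior[j]
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if delta < tol:
            break
    return scores


# Toy data (assumed): image 2 looks unlike image 0 but shares tags with it.
visual = [[0.0, 0.9, 0.0, 0.1],
          [0.9, 0.0, 0.0, 0.1],
          [0.0, 0.0, 0.0, 0.2],
          [0.1, 0.1, 0.2, 0.0]]
tags = [[0.0, 0.2, 0.8, 0.0],
        [0.2, 0.0, 0.1, 0.0],
        [0.8, 0.1, 0.0, 0.1],
        [0.0, 0.0, 0.1, 0.0]]

unified = fuse_graphs([visual, tags], weights=[0.5, 0.5])
initial_cbir = [1.0, 0.5, 0.0, 0.0]  # initial visual-only retrieval scores
print(random_walk(row_normalize(unified), initial_cbir))
```

Because each cue graph is built separately, changing `weights` is enough to shift the emphasis between visual and textual evidence, which matches the adaptivity claim in the abstract. In the toy data, image 2 starts with a zero CBIR score and gains rank only through its tag links to image 0.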
