
Style Replacement in Feature Space with Auxiliary Semantic Learning for Domain Generalization

Advisor: 許秋婷

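Abstract


The ability of a neural network model to generalize to new data is an important issue in real-world applications. Domain generalization aims to train a model on data from multiple different source domains so that it generalizes directly to any unseen target domain that is unavailable during training. In this thesis, we focus on domain generalization for image classification and propose an end-to-end, distinctive style replacement and semantic learning framework, called SRNet, built on two main ideas. The first is a novel style replacement method that encourages the network to extract style-invariant features. Second, to further promote semantic feature learning, the method includes an auxiliary self-supervised task that predicts the type of transformation applied to a transformed image. By combining style replacement with this auxiliary image transformation prediction task, we train a model that classifies images by their high-level semantic features or global object shapes, and thereby transfers cross-domain knowledge to unseen target domains. Experiments on two domain generalization benchmarks, PACS and VLCS, show that the proposed method is effective and outperforms previous methods.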

Parallel Abstract


The generalization ability of a deep neural network model is a crucial issue in real-world applications that must handle new data. Domain generalization uses data from multiple source domains to learn a model that generalizes well to any unknown target domain unavailable during training. In this thesis, we focus on domain generalization for the image recognition task and propose an end-to-end, multi-task learning framework (called SRNet) built on two main ideas. First, we propose a novel style replacement method that encourages the model to extract style-invariant features. Second, to further boost semantic feature learning, we include an auxiliary task that predicts, in a self-supervised way, the transformation type of a transformed image. By combining the style replacement method with the auxiliary image transformation prediction task, we train a model that transfers cross-domain knowledge to unknown target domains by classifying images according to their high-level semantic features or global object shapes. Experimental results on two domain generalization benchmarks, PACS and VLCS, demonstrate that the proposed method is effective and attains superior performance over previous methods.
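
The abstract does not say how style is represented or which image transformations are used, so the following is only an illustrative sketch, not SRNet itself. It assumes that style can be summarized by the per-channel mean and standard deviation of a feature map (a common convention in style-transfer work) and that the transformation types are rotations by multiples of 90 degrees. All function names (`channel_stats`, `replace_style`, `rotate90`, `make_transform_example`) are hypothetical.

```python
import math
import random


def channel_stats(channel, eps=1e-5):
    """Return (mean, std) of one flattened feature channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def replace_style(content_feat, style_feat):
    """Give content_feat the per-channel mean/std of style_feat.

    ASSUMPTION: style == channel-wise statistics. Both arguments are
    lists of channels; each channel is a flat list of floats.
    """
    out = []
    for c_chan, s_chan in zip(content_feat, style_feat):
        c_mean, c_std = channel_stats(c_chan)
        s_mean, s_std = channel_stats(s_chan)
        # Normalize away the content sample's own statistics, then
        # re-scale and shift with the other sample's statistics.
        out.append([(x - c_mean) / c_std * s_std + s_mean for x in c_chan])
    return out


def rotate90(image, k):
    """Rotate a 2D list-of-rows image clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def make_transform_example(image, rng):
    """ASSUMPTION: the transformation types are four rotations.

    Returns the rotated image and the rotation index, which serves as
    the self-supervised label for the auxiliary prediction task.
    """
    label = rng.randrange(4)
    return rotate90(image, label), label
```

In a setup like this, the classifier would be trained on style-replaced features while a second head predicts the rotation label, so that neither task can be solved from style cues alone. How SRNet actually combines the two losses is not stated in the abstract.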

