
Learning Interpretable Semantic Segmentation from Multi-Annotators
(可解釋性深度學習於多標注者圖像語意分割)

Advisor: 王鈺強

Abstract


To understand how deep learning models make classification predictions, much recent research has turned to developing model interpretability. However, most existing techniques cannot be directly applied to semantic segmentation, let alone provide interpretability for image semantic segmentation with multiple annotators. For the multi-annotator semantic segmentation task, this work aims to realize an interpretable model by answering two questions: "whose" annotations influence the prediction, and "why" the model is influenced by that annotator. We propose the Tendency-and-Assignment Explainable (TAX) learning framework, which enables the model to offer explanations at two levels: the annotator and the reason for the assignment. Under TAX, subsets of convolution kernels learn the labeling tendencies (annotation preferences) of different annotators, while a prototype bank exploits visual information to guide the learning of these kernel subsets. Our experiments show that TAX can be combined with state-of-the-art network architectures to achieve strong semantic segmentation performance, while providing satisfactory interpretability at both the annotator and assignment levels.

Parallel Abstract (English)


To understand how deep neural networks make classification predictions, recent research has focused on developing techniques that offer desirable explanations. However, most existing methods cannot be easily applied to semantic segmentation; moreover, they are not designed to offer interpretability under the multi-annotator setting. Instead of assuming that ground-truth pixel-level labels are annotated by the same expert or with a consistent labeling tendency, we aim at providing interpretable semantic segmentation, which answers two critical yet practical questions: "who" contributes to the resulting segmentation, and "why" such an assignment is determined. In this thesis, we present a unique Tendency-and-Assignment Explainable (TAX) learning framework, which is designed to offer interpretability at the annotator and assignment levels. With TAX, subsets of convolution kernels are derived to model the labeling tendencies of different annotators, while a prototype bank is jointly learned to offer visual guidance for learning these kernels. In our experiments, we show that TAX can be applied to state-of-the-art network architectures with comparable segmentation performance, while satisfactory interpretability at both levels can be properly realized.
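The abstract's two-level design can be pictured with a minimal numerical sketch. This is NOT the authors' TAX implementation: the thesis does not specify these shapes or operations, so every name and size below is an illustrative assumption. The sketch only shows the general idea stated above: each annotator gets its own "kernel subset" (here reduced to a per-annotator 1x1 head), a prototype bank grouped by annotator scores each pixel feature, the best-matching prototype answers "why", and a softmax over per-annotator scores answers "who" and blends the per-annotator predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes (not from the thesis):
# A annotators, K prototypes per annotator, C feature channels, CLS classes, HxW pixels.
A, K, C, CLS, H, W = 3, 4, 16, 5, 8, 8

heads = rng.normal(size=(A, CLS, C))     # one 1x1-conv "kernel subset" per annotator
prototypes = rng.normal(size=(A, K, C))  # prototype bank, grouped by annotator

def l2norm(x):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

def tax_sketch(feats):
    """feats: (C, H, W) backbone features for one image."""
    f = l2norm(feats.reshape(C, -1).T)                     # (HW, C) unit pixel features
    sim = np.einsum("nc,akc->nak", f, l2norm(prototypes))  # cosine sim to every prototype
    why = sim.argmax(-1)                                   # "why": closest prototype per annotator
    score = sim.max(-1)                                    # (HW, A) per-annotator evidence
    e = np.exp(score - score.max(-1, keepdims=True))
    who = e / e.sum(-1, keepdims=True)                     # "who": soft annotator assignment
    per_head = np.einsum("aoc,nc->nao", heads, f)          # per-annotator class logits
    logits = (who[..., None] * per_head).sum(1)            # blend predictions by assignment
    return logits.T.reshape(CLS, H, W), who.T.reshape(A, H, W), why

logits, who, why = tax_sketch(rng.normal(size=(C, H, W)))
```

For each pixel, `who` gives a distribution over annotators (summing to 1) and `why` names the prototype that drove that assignment, mirroring the two explanation levels described above; the actual thesis learns the kernels and prototypes end-to-end rather than sampling them randomly.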

