

Improving Clustering Uncertainty-weighted Embeddings for Active Domain Adaptation

Advisor: Hsuan-Tien Lin

Abstract


Domain adaptation (DA) is a technique that generalizes deep neural networks to new target domains under domain shift. In practice, we can leverage labeled data from the target domain, but collecting fully labeled data is expensive and time-consuming. To this end, we introduce active learning for domain adaptation, which reduces the annotation effort by selectively labeling a small quota of target data that maximally benefits the DA model. This learning paradigm is known as Active Domain Adaptation (ADA). The prior state of the art, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), performs uncertainty-weighted clustering to identify target instances for labeling. In this work, we carefully study how ADA-CLUE handles uncertainty and diversity. We propose two variants of ADA-CLUE: Density-Weighted Uncertainty Sampling (DWUS) and constant-weighted clustering sampling. DWUS clusters the deep embeddings of target instances with constant weights and acquires labels for the nearest neighbors to the centroids, scored by the corresponding uncertainty of the target model. Comparing the variants, we observe that constant-weighted clustering sampling outperforms CLUE's uncertainty-weighted clustering in early rounds, when the model's uncertainty estimates are still unstable, whereas the uncertainty term selects more informative instances after several rounds of training. Motivated by this observation, we introduce a threshold that controls when to switch from constant-weighted to uncertainty-weighted clustering. We call this simple solution Clustering Uncertainty-weighted Embeddings with Loop Threshold, a new label-acquisition strategy for active DA. Our analysis brings new insights into the use of uncertainty, and empirical results demonstrate that the proposed technique achieves superior performance.
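To make the acquisition step concrete, the following is a minimal illustrative sketch (our own code, not the thesis implementation) of CLUE-style uncertainty-weighted clustering: each target instance is weighted by the predictive entropy of the model's softmax output, a weighted k-means is run on the deep embeddings, and the instance nearest to each centroid is queried for a label. All function names and the synthetic data are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def predictive_entropy(probs):
    # Entropy of the softmax outputs; larger means the model is less certain.
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def clue_select(embeddings, probs, budget, seed=0):
    # Weighted k-means: uncertain instances pull centroids toward them.
    weights = predictive_entropy(probs)
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed)
    km.fit(embeddings, sample_weight=weights)
    # Distance from every instance to every centroid; query the nearest
    # instance per centroid (duplicates collapsed by np.unique).
    dists = np.linalg.norm(
        embeddings[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    return np.unique(dists.argmin(axis=0))

# Synthetic stand-ins for target-domain embeddings and model outputs.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 16))
logits = rng.normal(size=(200, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
queried = clue_select(embeddings, probs, budget=10)
```

The `sample_weight` argument of `KMeans.fit` is what distinguishes uncertainty-weighted clustering from plain diversity sampling; setting it to a constant vector recovers the constant-weighted variant discussed above.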

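The loop-threshold rule described in the abstract can be sketched as follows. This is a hypothetical illustration under our own naming: in early acquisition rounds (below a threshold `T`) the clustering uses constant weights, since the model's uncertainty estimates are still unstable; later rounds switch to uncertainty weights. The value of `T` and all identifiers are assumptions, not taken from the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_with_loop_threshold(embeddings, uncertainty, budget,
                               round_idx, T=3, seed=0):
    if round_idx < T:
        weights = np.ones(len(embeddings))  # constant-weighted clustering
    else:
        weights = uncertainty               # uncertainty-weighted (CLUE-style)
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed)
    km.fit(embeddings, sample_weight=weights)
    # Query the instance nearest to each centroid.
    dists = np.linalg.norm(
        embeddings[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    return np.unique(dists.argmin(axis=0))

# Synthetic stand-ins for embeddings and per-instance uncertainty scores.
rng = np.random.default_rng(1)
emb = rng.normal(size=(150, 8))
unc = rng.uniform(0.1, 1.0, size=150)
early = select_with_loop_threshold(emb, unc, budget=8, round_idx=0)  # constant
late = select_with_loop_threshold(emb, unc, budget=8, round_idx=5)   # uncertainty
```

The single `round_idx < T` branch captures the strategy's intent: spend the unreliable early rounds on pure diversity, then let uncertainty reweight the clusters once the target model's estimates have stabilized.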

