Learning from Label Proportions with Consistency Regularization

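The problem of learning from label proportions (LLP) involves training a classifier with weak labels attached to bags of instances, rather than strong labels on individual instances. The weak labels contain only the label proportions of each bag. LLP matters for many practical applications in which data privacy or annotation cost allows only label proportions to be collected, and it has recently received considerable research attention. Most existing methods focus on extending supervised learning models to solve the LLP problem, but the nature of weak labels makes it hard to improve classification performance further from a supervised learning perspective. In this paper, we therefore propose a new method from the perspective of semi-supervised learning. More specifically, we propose a novel model inspired by consistency regularization, a popular concept in semi-supervised learning that encourages the model to produce a decision boundary that better describes the data manifold. With the introduction of consistency regularization, we further extend the study to settings that better match practical needs, and we show experimentally that the parameter-selection procedure can rely on label proportions alone. The experiments demonstrate both the strong performance of learning from label proportions with consistency regularization and the practical usability of the proposed method.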

Keywords

Learning from Label Proportions; Consistency Regularization; Semi-supervised Learning

Parallel Abstract

The problem of learning from label proportions (LLP) involves training classifiers with weak labels on bags of instances, rather than strong labels on individual instances. The weak labels contain only the label proportions of each bag. The LLP problem is important for many practical applications in which data privacy or annotation cost allows only label proportions to be collected, and it has recently received considerable research attention. Most existing works focus on extending supervised learning models to solve the LLP problem, but the weak nature of the labels makes it hard to improve LLP performance further from a supervised angle. In this paper, we instead approach the problem from the angle of semi-supervised learning. In particular, we propose a novel model inspired by consistency regularization, a popular concept in semi-supervised learning that encourages the model to produce a decision boundary that better describes the data manifold. With the introduction of consistency regularization, we further extend our study to non-uniform bag-generation and validation-based parameter-selection procedures that better match practical needs. Experiments not only show that LLP with consistency regularization achieves superior performance, but also demonstrate the practical usability of the proposed procedures.
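To make the two ingredients named in the abstract concrete, the following is a minimal sketch of how a bag-level proportion loss might be combined with a consistency term. It is not the authors' implementation: the abstract does not specify the loss functions, so the cross-entropy form of the proportion loss, the KL-divergence form of the consistency term, the weight `alpha`, and all function names are assumptions made for illustration. The perturbed predictions are assumed to come from the same model evaluated on slightly perturbed inputs.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def proportion_loss(probs, bag_proportions, eps=1e-12):
    """Cross-entropy between the known bag proportions and the bag's
    mean predicted class distribution (assumed form of the weak-label loss)."""
    mean_pred = probs.mean(axis=0)
    return float(-np.sum(bag_proportions * np.log(mean_pred + eps)))


def consistency_loss(probs, probs_perturbed, eps=1e-12):
    """Mean KL(p || p_perturbed) over the instances in a bag
    (assumed form of the consistency term)."""
    kl = np.sum(
        probs * (np.log(probs + eps) - np.log(probs_perturbed + eps)), axis=1
    )
    return float(kl.mean())


def llp_objective(logits, logits_perturbed, bag_proportions, alpha=1.0):
    """Proportion loss plus alpha times the consistency term for one bag.
    alpha is a hypothetical trade-off weight."""
    p = softmax(logits)
    q = softmax(logits_perturbed)
    return proportion_loss(p, bag_proportions) + alpha * consistency_loss(p, q)


# Example: one bag of four instances, two classes, known proportions 50/50.
rng = np.random.default_rng(0)
bag_logits = rng.normal(size=(4, 2))
# Stand-in for model outputs on perturbed inputs (small random noise here).
bag_logits_perturbed = bag_logits + 0.1 * rng.normal(size=(4, 2))
print(llp_objective(bag_logits, bag_logits_perturbed, np.array([0.5, 0.5])))
```

Note that only the bag-level proportions enter the objective; no individual instance label is used, which is what distinguishes this setting from ordinary supervised training.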

Parallel Keywords

Learning from Label Proportions; Consistency Regularization; Semi-supervised Learning
