
以特徵感知的成本導向標籤嵌入法解決多標籤分類問題

Multi-label Classification with Feature-aware Cost-sensitive Label Embedding

Advisor: Hsuan-Tien Lin

Abstract

Multi-label classification (MLC) is an important learning problem in which each instance is annotated with multiple labels. Label embedding (LE) is an important family of methods for MLC that extracts and exploits the latent structure of the labels to improve performance. Within this family, feature-aware LE methods, which jointly consider feature and label information during extraction, have been shown to outperform feature-unaware ones. Nevertheless, current feature-aware LE methods are not designed to adapt flexibly to different evaluation criteria. In this work, we propose a novel feature-aware LE method that takes the desired evaluation criterion into account during training. The method, named Feature-aware Cost-sensitive Label Embedding (FaCLE), uses a deep Siamese network to encode the criterion into the distance between embedded vectors. FaCLE achieves feature awareness with a loss function that jointly considers the embedding error and the feature-to-embedding error. Moreover, FaCLE is coupled with an additional-bit trick to handle possibly asymmetric criteria. Experimental results across different datasets and evaluation criteria demonstrate that FaCLE is superior to other state-of-the-art feature-aware LE methods and cost-sensitive LE methods.
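The idea of a loss that combines an embedding error with a feature-to-embedding error can be illustrated with a small sketch. This is not the thesis's actual formulation: the abstract does not give the loss, so the squared-gap form, the weight `alpha`, the function names, and the use of Hamming loss as the evaluation criterion are all assumptions made for illustration. The deep Siamese network is replaced by fixed, hand-picked embedding vectors, and the additional-bit trick is not shown.

```python
import math


def hamming_cost(y_a, y_b):
    """Fraction of positions where two binary label vectors differ."""
    return sum(a != b for a, b in zip(y_a, y_b)) / len(y_a)


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def joint_loss(label_embs, feature_embs, labels, cost, alpha=1.0):
    """Hypothetical joint objective (an assumption, not the thesis's loss).

    Embedding error: for every pair (i, j), the squared gap between the
    distance of the two label embeddings and the cost between the labels.
    Feature-to-embedding error: squared distance between the vector
    predicted from the features and the label embedding of each instance.
    """
    n = len(labels)
    embedding_error = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            gap = euclidean(label_embs[i], label_embs[j]) - cost(labels[i], labels[j])
            embedding_error += gap ** 2
    feature_error = sum(
        euclidean(f, z) ** 2 for f, z in zip(feature_embs, label_embs)
    )
    return embedding_error + alpha * feature_error


# Three toy label vectors; pairwise Hamming costs are 0.25, 0.75 and 1.0.
labels = [(1, 0, 1, 0), (1, 1, 1, 0), (0, 0, 0, 1)]

# One-dimensional embeddings chosen so that distances equal those costs.
label_embs = [(0.0,), (0.25,), (-0.75,)]

# Feature-side predictions that hit the label embeddings exactly,
# so both error terms vanish and the loss is 0.
perfect = joint_loss(label_embs, label_embs, labels, hamming_cost)

# Moving the first feature-side prediction by 0.5 adds only feature error.
shifted = joint_loss(label_embs, [(0.5,), (0.25,), (-0.75,)], labels,
                     hamming_cost, alpha=2.0)
```

In this toy setting the first term rewards embeddings whose geometry mirrors the chosen criterion, while the second term keeps the embeddings predictable from the features; `alpha` trades the two off.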

