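In the real world, different multi-label problems often call for different evaluation criteria, so incorporating the evaluation criterion into the learning algorithm has become an important problem. We call this problem cost-sensitive multi-label classification. Most existing methods cannot handle arbitrary evaluation criteria, while the other cost-sensitive methods have excessively high time complexity. In this work, we propose the progressive random k-labelsets algorithm to address both issues. The algorithm extends the well-known random k-labelsets algorithm and therefore has the same efficiency. In addition, the method progressively transforms the original problem into a series of cost-sensitive multi-class classification problems and can handle general evaluation criteria. Experimental results show that progressive random k-labelsets performs comparably to other algorithms designed specifically for particular evaluation criteria, and that under other criteria the proposed method significantly outperforms the other methods.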
Many real-world applications of multi-label classification come with different performance evaluation criteria. It is thus important to design general multi-label classification methods that can flexibly take different criteria into account. Such methods tackle the problem of cost-sensitive multi-label classification (CSMLC). Most existing CSMLC methods either suffer from high computational complexity or focus on only a few specific criteria. In this work, we propose a novel CSMLC method, named progressive random k-labelsets (PRAKEL), to resolve both issues. The method extends a popular multi-label classification method, random k-labelsets, and hence inherits its efficiency. Furthermore, the proposed method can handle general evaluation criteria by progressively transforming the CSMLC problem into a series of cost-sensitive multi-class classification problems. Experimental results demonstrate that PRAKEL is competitive with existing methods under the specific criteria they can optimize, and outperforms them under general criteria.
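To make the reduction concrete, the following minimal sketch shows how one randomly drawn k-labelset can turn a multi-label example into a cost vector over 2^k classes, and how the choice of criterion changes that vector. It is an illustration under stated assumptions, not the PRAKEL procedure itself: the function names (`hamming_loss`, `f1_cost`, `sample_labelset`, `multiclass_costs`) are hypothetical, and the sketch assumes that labels outside the labelset are predicted correctly, which is a simplification introduced here and not taken from the abstract.

```python
from itertools import product
import random


def hamming_loss(y_true, y_pred):
    """Fraction of labels that are predicted incorrectly."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_cost(y_true, y_pred):
    """1 - F1 score; taken to be 0 when both vectors are all-zero."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    denom = sum(y_true) + sum(y_pred)
    return 0.0 if denom == 0 else 1.0 - 2.0 * tp / denom


def sample_labelset(num_labels, k, rng):
    """Draw one k-labelset: k distinct label indices chosen at random."""
    return sorted(rng.sample(range(num_labels), k))


def multiclass_costs(y_true, labelset, cost_fn):
    """Cost of each of the 2^k classes of a labelset for one example.

    Each class is one 0/1 assignment to the labels in `labelset`.
    ASSUMPTION (illustrative only): labels outside the labelset are
    copied from y_true, i.e. treated as predicted correctly.
    """
    costs = []
    for bits in product([0, 1], repeat=len(labelset)):
        y_pred = list(y_true)
        for idx, bit in zip(labelset, bits):
            y_pred[idx] = bit
        costs.append(cost_fn(y_true, y_pred))
    return costs


if __name__ == "__main__":
    rng = random.Random(0)
    y_true = [1, 0, 1, 0]
    labelset = sample_labelset(len(y_true), k=2, rng=rng)
    print("labelset:", labelset)
    print("Hamming costs:", multiclass_costs(y_true, labelset, hamming_loss))
    print("F1 costs:     ", multiclass_costs(y_true, labelset, f1_cost))
```

For the true label vector `[1, 0, 1, 0]` and the labelset `[0, 1]`, the classes `(0, 0)` and `(1, 1)` tie under Hamming loss but receive different costs under the F1-based cost. This is the kind of criterion-dependent information that a cost-sensitive multi-class learner can use and that a cost-insensitive one ignores.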