使用機率模型實行成本導向多重分類的主動學習演算法

Active Learning for Multiclass Cost-sensitive Classification Using Probabilistic Models

Advisor: Hsuan-Tien Lin (林軒田)

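Abstract


Active learning for cost-sensitive multiclass classification is a relatively new research direction. For this problem, this thesis proposes two cost-sensitive active learning strategies: maximum expected cost and cost-weighted minimum margin. Both strategies can be viewed as extensions of existing cost-insensitive strategies. Experimental results show that, in cost-sensitive settings, the cost-sensitive strategies perform well and clearly outperform the cost-insensitive strategies. The results also show that the difficulty of the learning data affects the performance of cost-sensitive active learning algorithms to some degree. Therefore, in practical active learning applications, choosing the strategy based on an analysis of the data's characteristics is the better approach.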

Parallel Abstract


Multiclass cost-sensitive active learning is a relatively new problem. In this thesis, we derive the maximum expected cost and cost-weighted minimum margin strategies for multiclass cost-sensitive active learning. These two strategies can be seen as extended versions of classical cost-insensitive active learning strategies. The experimental results demonstrate that the derived strategies are promising for cost-sensitive active learning. In particular, the cost-sensitive strategies outperform cost-insensitive ones on many benchmark data sets. The results also reveal how the hardness of data affects the performance of active learning strategies. Thus, in practical active learning applications, data analysis before strategy selection can be important.
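The abstract names the two strategies without defining them, so the following is only a minimal sketch of one plausible reading, not the thesis's own formulation. It assumes a probabilistic model supplies class probabilities P(y | x) for each unlabeled instance, and a cost matrix where `cost[y][k]` is the cost of predicting class k when the true class is y. Under those assumptions, "maximum expected cost" is read as querying the instance whose best prediction still has the highest expected cost, and "cost-weighted minimum margin" as querying the instance with the smallest gap between its two lowest expected costs. All function names here are hypothetical.

```python
from typing import List, Sequence


def expected_costs(probs: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> List[float]:
    """Expected cost of predicting each class k: sum over y of P(y|x) * cost[y][k]."""
    n_classes = len(probs)
    return [sum(probs[y] * cost[y][k] for y in range(n_classes))
            for k in range(n_classes)]


def best_prediction_cost(probs: Sequence[float],
                         cost: Sequence[Sequence[float]]) -> float:
    """Expected cost of the cost-minimizing prediction (assumed MEC score).

    A higher value means even the best guess is expected to be costly.
    """
    return min(expected_costs(probs, cost))


def cost_weighted_margin(probs: Sequence[float],
                         cost: Sequence[Sequence[float]]) -> float:
    """Gap between the two lowest expected costs (assumed CWMM score).

    A smaller value means the two best predictions are harder to tell apart.
    """
    lowest, second = sorted(expected_costs(probs, cost))[:2]
    return second - lowest


def select_query(pool_probs: Sequence[Sequence[float]],
                 cost: Sequence[Sequence[float]],
                 strategy: str = "mec") -> int:
    """Return the index of the pool instance to label next."""
    indices = range(len(pool_probs))
    if strategy == "mec":
        # Assumed reading: largest expected cost of the best prediction.
        return max(indices, key=lambda i: best_prediction_cost(pool_probs[i], cost))
    if strategy == "cwmm":
        # Assumed reading: smallest margin between the two cheapest predictions.
        return min(indices, key=lambda i: cost_weighted_margin(pool_probs[i], cost))
    raise ValueError(f"unknown strategy: {strategy}")
```

With a uniform 0/1 cost matrix, the first score reduces to one minus the largest class probability and the second to the gap between the two largest probabilities, which matches the abstract's description of the strategies as extensions of cost-insensitive ones (least-confidence and minimum-margin sampling, respectively).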

