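Classifying data into ordinal categories is a common problem in practice, and researchers have proposed many classifiers for ordinal data, including widely used statistical models and the data mining methods that have been cited extensively in recent years. However, the statistical models behind most classifiers can be fitted only when their assumptions hold, for example that the data satisfy homogeneity, normality, and independence. In addition, the performance index used to evaluate classifiers bears on the final choice of classifier, and an inappropriate index may lead to selecting a classifier that performs poorly. This thesis recommends evaluating ordinal classifiers with the weighted kappa coefficient, weighted by an asymmetric weight matrix computed from actual misclassification costs; this both reflects real conditions and accounts for the agreement between predicted and actual classes. This thesis also compares linear discriminant analysis, a common statistical method, with ten data mining classification methods in order to find the classifiers with better classification performance.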
Many conventional methods exist for classifying ordinal data into specific classes, including statistical methods and data mining methods. However, statistical methods rest on the assumptions of normality, independence, and homogeneity. This thesis compares classifiers built with linear discriminant analysis against ten data mining methods and finds that the former is less powerful. Moreover, the performance index used to compare classifiers affects the quality of the final decision: an improper index may lead to a wrong choice of classifier. The performance index proposed in this thesis extends the weighted kappa coefficient with an asymmetric weight matrix calculated from the cost of misclassification. The results show that the proposed index is more credible.
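As a rough illustration of the kind of index described above, the following sketch computes a weighted kappa in which the disagreement weights come from a misclassification cost matrix that need not be symmetric. It uses the standard definition, one minus the ratio of observed to chance-expected weighted disagreement; the exact index proposed in the thesis may differ. The function name `weighted_kappa`, the confusion counts, and the cost values are all hypothetical and chosen only for demonstration.

```python
def weighted_kappa(confusion, cost):
    """Weighted kappa with an arbitrary (possibly asymmetric) cost matrix.

    confusion[i][j]: number of cases whose actual class is i and predicted class is j.
    cost[i][j]: cost of predicting class j when the actual class is i (0 on the diagonal).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_tot = [sum(confusion[i]) for i in range(k)]                        # actual-class counts
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # predicted-class counts

    # Observed average cost per case.
    observed = sum(cost[i][j] * confusion[i][j]
                   for i in range(k) for j in range(k)) / total
    # Average cost expected if predictions were independent of the actual class.
    expected = sum(cost[i][j] * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / total ** 2
    if expected == 0:
        raise ValueError("expected cost is zero; kappa is undefined")
    return 1.0 - observed / expected


# Hypothetical three-class ordinal example (illustrative numbers only).
confusion = [[20, 5, 0],
             [3, 15, 2],
             [0, 4, 11]]
# Assumed asymmetric costs: predicting too low a class costs twice as much
# as predicting too high a class by the same number of levels.
cost = [[0, 1, 2],
        [2, 0, 1],
        [4, 2, 0]]
print(round(weighted_kappa(confusion, cost), 3))
```

A value of 1 indicates perfect agreement, and a value of 0 indicates a total cost no better than chance assignment with the same marginal class frequencies. Because the cost matrix is asymmetric, two classifiers with the same error counts can receive different scores depending on the direction of their errors.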