In this thesis we propose a novel evaluation criterion for medical data mining, "specificity under perfect sensitivity" (SUPS). The criterion assesses how well a classification model can ensure that the instances it predicts as negative are in fact negative. We argue that it is useful in medical data mining when the cost of a false negative is so high that no false negatives can be allowed. We further propose two strategies that help a classifier obtain a higher SUPS. The first, suspicion expansion, loosens the definition of positive by labeling negative instances that lie close to a positive instance as suspicious, which strengthens confidence in the instances predicted as negative. The second, false positive tolerance, tolerates some misclassified negative instances of positive patients in order to reduce false negatives at the patient level. Experimental results show that, compared with the original classifiers, our methods improve SUPS significantly.
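As a rough illustration of the criterion, the sketch below computes specificity under perfect sensitivity from classifier scores. It assumes a score-based binary classifier in which an instance is predicted positive when its score is at least a threshold, and it takes the threshold to be the lowest score among the true positives so that no positive is missed. The function name, the score convention, and the sample data are hypothetical and chosen only for illustration; the exact definition used in the thesis may differ.

```python
from typing import Sequence


def specificity_under_perfect_sensitivity(
    scores: Sequence[float], labels: Sequence[int]
) -> float:
    """Specificity at the strictest threshold that still flags every positive.

    Assumption: higher scores mean "more likely positive", labels are 1
    (positive) or 0 (negative), and an instance is predicted positive when
    its score is >= the threshold.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative instance")

    # The highest threshold with zero false negatives is the lowest
    # score assigned to any true positive.
    threshold = min(positives)

    # Negatives scoring strictly below the threshold are predicted negative,
    # so they count as true negatives.
    true_negatives = sum(1 for s in negatives if s < threshold)
    return true_negatives / len(negatives)


# Hypothetical scores: two positives and four negatives.
example_scores = [0.9, 0.4, 0.35, 0.2, 0.1, 0.6]
example_labels = [1, 1, 0, 0, 0, 0]
sups = specificity_under_perfect_sensitivity(example_scores, example_labels)
print(sups)  # 0.75: three of the four negatives fall below the threshold 0.4
```

In this toy case a single negative scoring above the weakest positive (0.6 versus 0.4) is what keeps the value below 1, which shows why a few hard positives can dominate the criterion.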