
以資料探勘技術對正異常口腔抹片影像之特徵選取與分類

The Use of Data Mining Techniques for Feature Selection and Classification in Normal and Abnormal Oral Smear Analysis

Advisor: 蘇振隆

Abstract


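Taiwan's distinctive betel quid chewing culture has kept the incidence and mortality of oral cancer high for many years. Clinical examination still relies mainly on visual inspection and palpation, so patients are often already seriously ill when they seek care, and diagnosis and treatment of oral cancer are delayed. Oral smears can serve as an initial screening method and could in future become a routine examination for people at high risk of oral cancer, making it possible to reach the cure rate of up to 90% associated with early detection. The main purpose of this study is to quantify parameters of oral smear images and to classify normal and abnormal cells, giving clinicians a reference for reading smear images.

Methodologically, median filtering and mathematical morphology are used to improve the images, and image preprocessing yields parameters from both color and grayscale images. For color images, the RGB and HSI models are used to characterize normal and abnormal cells. For grayscale images, shape parameters are computed, including nucleus area, cytoplasm area and the nucleus-to-cytoplasm (N/C) ratio, and texture parameters are derived from the gray-level co-occurrence matrix. Finally, a genetic algorithm extracts the informative features, and the extracted parameters are analyzed with data mining: a decision tree and a Bayesian classifier are each used to classify normal and abnormal cells, and the diagnostic performance of the two classifiers and of different feature sets is evaluated.

The results show that, for discriminating normal from abnormal oral smears, the Bayesian classifier achieves a diagnostic accuracy of 91%, clearly better than the 87.2% of the decision tree's C4.5 learning algorithm, although the decision tree explains its results better and is easier to understand than the Bayesian classifier. Among the feature groups evaluated, those produced by P value < 0.001 and by genetic algorithm extraction perform best, with diagnostic accuracies of 97.8% and 98% respectively on the training samples and 91.2% and 95.75% respectively on the test samples. For the P value < 0.05 feature group, accuracies on the training and test samples are only 79.8% and 69.5% respectively. Pooling these findings yields nine features of diagnostic value: nucleus area, N/C ratio, gray-level standard deviation, entropy, energy, homogeneity, red minimum, red mean and image intensity.

Overall, the completed system supports quantitative parameter analysis of normal and abnormal oral cells and classifies them. The data mining techniques used in this study are also discussed and evaluated with a view to future medical applications. The diagnostically valuable feature parameters identified will give clinicians a reference for reading smear images, and the diagnostic knowledge rules established here provide a basis for building diagnostic rules in future smear-reading systems.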
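The shape and texture parameters named above (N/C ratio, and entropy, energy and homogeneity from the gray-level co-occurrence matrix) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the single horizontal pixel offset, the symmetric matrix, and the specific formulas (energy as the sum of squared probabilities, homogeneity as the inverse difference moment) are assumptions chosen for illustration, since the abstract does not state which definitions were used.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix as a dict {(i, j): p}.

    `image` is a list of rows of integer gray levels. The offset (dx, dy)
    and the symmetric counting are illustrative assumptions.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    counts[(b, a)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Entropy, energy and homogeneity of a normalized co-occurrence matrix."""
    energy = sum(v * v for v in p.values())  # angular second moment
    entropy = -sum(v * math.log2(v) for v in p.values() if v > 0)
    homogeneity = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    return {"energy": energy, "entropy": entropy, "homogeneity": homogeneity}


def nc_ratio(nucleus_mask, cell_mask):
    """Nucleus area divided by cytoplasm area, from binary (0/1) masks.

    Cytoplasm is taken as the cell region minus the nucleus region.
    """
    nucleus_area = sum(sum(row) for row in nucleus_mask)
    cytoplasm_area = sum(sum(row) for row in cell_mask) - nucleus_area
    if cytoplasm_area <= 0:
        raise ValueError("cell mask must be larger than nucleus mask")
    return nucleus_area / cytoplasm_area


# Hypothetical 2x2 inputs, only to show the call pattern.
features = texture_features(glcm([[0, 1], [1, 0]]))
ratio = nc_ratio([[0, 1], [0, 0]], [[1, 1], [1, 1]])
```

In practice the masks would come from the segmentation step (median filtering and morphology), and the gray levels are usually quantized to a small number of bins before the matrix is built.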

Parallel Abstract


Oral cancer incidence and mortality in Taiwan have long been high because of the local betel quid chewing habit. However, the cure rate reaches 90% when oral disease is detected early. Oral smear screening is therefore a useful early diagnostic method, and it could become a routine examination that improves on current clinical practice, which relies only on visual inspection and palpation of the mouth and may delay treatment. The objective of this research is to use oral smear images for quantitative analysis and for classification of normal and abnormal smears, giving physicians a rule base for smear identification.

First, a median filter and mathematical morphology were used to improve image quality. Color and gray-level feature sets were then extracted through image preprocessing. For color images, the RGB and HSI models were applied to characterize the difference between normal and abnormal oral smears. For gray-level images, shape features were calculated, including nucleus area, cytoplasm area and the ratio of nucleus area to cytoplasm area (N/C ratio), and the co-occurrence matrix was used to analyze image texture. Finally, valuable feature sets were extracted with a Genetic Algorithm (GA) and analyzed with data mining: normal and abnormal cells were classified with a Decision Tree (DT) and a Bayesian Classifier (BC). The diagnostic performance of both classifiers and of the different feature sets was also evaluated.

For recognition of normal and abnormal oral cells, accuracy was 91% with BC and 87.2% with the C4.5 learning algorithm of DT. BC therefore diagnoses better than DT; however, DT explains its diagnostic knowledge and rules better than BC. Among the feature sets evaluated, the two groups derived from P < 0.001 and from GA reached 97.8% and 98% accuracy on the training set and 91.2% and 95.75% on the testing set, respectively, so both show good diagnostic behavior. By contrast, the feature set derived from P < 0.05 reached only 79.8% accuracy on the training set and 69.5% on the testing set. From this research we obtain nine valuable diagnostic features: nucleus area, N/C ratio, standard deviation of gray level, entropy, energy, homogeneity, minimum value of red, average value of red and image intensity.

Overall, the completed system lets users perform quantitative analysis and classify normal and abnormal cells. We also discuss and evaluate the data mining techniques involved with respect to medical applications and diagnostic systems. The mined features and the diagnostic rules established provide not only a basis for clinical smear image identification but also a rule base for developing oral smear verification systems in the future.
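The training and testing accuracies reported above come from fitting a classifier on one set of feature vectors and scoring it on another. A minimal sketch of that evaluation follows, assuming a Gaussian naive Bayes model; the abstract says only "Bayesian Classifier", so the specific model is an assumption, and the two-feature samples (nucleus area, N/C ratio) are made-up values for illustration, not data from the thesis.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Fit per-class priors, feature means and variances.

    Assumes features are conditionally independent and Gaussian within
    each class (an illustrative choice, not confirmed by the abstract).
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, row) == label for row, label in zip(X, y))
    return hits / len(y)


# Made-up samples: [nucleus area, N/C ratio]. Not thesis data.
X_train = [[40, 0.05], [45, 0.06], [50, 0.07],
           [90, 0.30], [100, 0.35], [110, 0.40]]
y_train = ["normal"] * 3 + ["abnormal"] * 3
X_test = [[48, 0.06], [95, 0.33]]
y_test = ["normal", "abnormal"]

model = fit_gaussian_nb(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
```

Comparing `train_acc` with `test_acc` for each feature set mirrors the comparison in the abstract, where a large gap between the two (as in the P < 0.05 group) suggests the feature set generalizes poorly.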

Cited by


陳佳男 (2007). 以脈波和良導絡特徵參數建構之高血壓辨析系統 [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu200700939
張雅婷 (2008). 以資料探勘技術建立輔助乳癌診斷模型 [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-1208200813004000
