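According to the International Agency for Research on Cancer, breast cancer is among the cancers with the highest incidence in women, which makes the diagnosis of breast cancer an important research topic. Cancer staging classifies the condition of a malignant tumor, and both the type and the stage strongly affect a patient's subsequent survival. This study applies data mining methods, combined with computer processing, to build a majority-vote prediction model for breast cancer, with the aim of improving diagnostic accuracy within the time a physician has available for diagnosis. Data derived from the diagnosis of different stages and from treatment guidelines are analyzed, and the resulting model provides information that physicians can consult when judging stage and treatment in clinical practice, and as a reference for early diagnosis and treatment. The data come from cases in the Wisconsin Diagnostic Breast Cancer (WDBC) data set. Five classifiers are applied: the Back Propagation Network (BPN), the C5.0 algorithm from the Decision Trees (DT) family, Support Vector Machines (SVM), Logistic Regression (LR), and Case Based Reasoning (CBR). Their predictions are combined by voting to form the diagnostic prediction model, to evaluate which majority-vote model is best suited as an auxiliary system for breast cancer determination, and to serve as a reference for building prediction models.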
Breast cancer is one of the most prevalent cancers among women, and its accurate diagnosis remains a challenge in clinical settings. This study aims to improve the accuracy of a breast cancer pathologic prediction model so that it can better support doctors in determining the stage and treatment of the disease. The Wisconsin Diagnostic Breast Cancer (WDBC) data used in this study contain features computed from a digitized image of a fine needle aspirate of a breast mass; these features describe the characteristics of the cell nuclei present in the image. Several data mining methods were used as the classifiers for majority voting: the Back Propagation Network (BPN), the C5.0 algorithm for Decision Trees (DT), Support Vector Machines (SVM), Logistic Regression (LR), and Case Based Reasoning (CBR). The study also implemented a Combined Voting system, which merges the predictions of these classifiers, as an auxiliary evaluation tool for determining the most appropriate breast cancer pathologic diagnosis model.
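The voting step can be illustrated with a minimal sketch. It assumes hard (one classifier, one vote) majority voting over the five classifiers named above; the abstract does not specify the exact voting rule, so the function name `majority_vote`, the labels "M" (malignant) and "B" (benign), the example predictions, and the tie-breaking choice are all hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions, tie_break="M"):
    """Return the label chosen by most classifiers for a single case.

    predictions: dict mapping a classifier name to its predicted label.
    tie_break:   label returned when the two leading labels have equal votes
                 (assumption: fall back to "malignant" to stay cautious).
    """
    counts = Counter(predictions.values())
    ranked = counts.most_common()  # list of (label, votes), highest first
    # A tie exists when the two leading labels received the same vote count.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_break
    return ranked[0][0]


# Hypothetical predictions for one case; these are not results from the study.
case_predictions = {
    "BPN": "M",
    "C5.0": "B",
    "SVM": "M",
    "LR": "M",
    "CBR": "B",
}

final_label = majority_vote(case_predictions)
print(final_label)
```

With five classifiers and two classes a tie cannot occur, so the tie-breaking branch only matters if a classifier is dropped or more classes are added.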