以資料探勘技術建立輔助乳癌診斷模型

Utilizing Data Mining Techniques to Construct Assisted Breast Cancer Diagnosis Model

Advisor: 林榮禾

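Abstract


Breast cancer is one of the most common cancers among women worldwide. With recent medical advances, early diagnosis and appropriate treatment bring the average 10-year survival rate for breast cancer to 60%; the survival rate exceeds 80% for stage I breast cancer and approaches 100% for stage 0. As technology advances and computing becomes widespread, large volumes of hospital patient data can be acquired and analyzed more quickly, and data mining methods can perform prediction and classification in a short time, giving physicians a reference for diagnosis. To build a model that assists breast cancer diagnosis, this study applies back-propagation neural networks (BPN), decision trees (C5.0), Bayesian networks (BN), support vector machines (SVM), logistic regression (LR), discriminant analysis (DA), case-based reasoning (CBR), and multivariate adaptive regression splines (MARS) to fine-needle aspiration (FNA) breast examination data. Two families of models are built: single breast cancer diagnosis models, comprising the single diagnosis model and the reconfirmation diagnosis model; and multiple breast cancer diagnosis models, comprising the inconsistency diagnosis model and the voting combination diagnosis model. A genetic algorithm (GA) is also used to quickly find the best combination of diagnosis models among the many candidates. The results show that the reconfirmation, inconsistency, and voting combination models all outperform the single diagnosis model, with the voting combination model performing best at an accuracy of 98.82%. GA does reduce the time needed to build the models and finds the best combination for breast cancer diagnosis, so predictions can be made quickly, providing physicians with a reference for diagnosis and improving diagnostic accuracy.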
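
The voting combination idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so plain majority voting is assumed here. The function names, the tie-breaking rule, and the toy labels are all hypothetical, and the three "models" are hard-coded label lists rather than the BPN, C5.0, or SVM classifiers used in the thesis.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of equal-length label lists, one list per model.
    Assumption: ties go to the tied label predicted by the earliest model.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):  # votes = one label per model for a sample
        counts = Counter(votes)
        top = max(counts.values())
        # First label (in model order) that reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy data: "M" = malignant, "B" = benign.
actual = ["M", "B", "B", "M", "B"]
model_a = ["M", "B", "M", "M", "B"]  # wrong on sample 3
model_b = ["M", "B", "B", "B", "B"]  # wrong on sample 4
model_c = ["B", "B", "B", "M", "B"]  # wrong on sample 1

voted = majority_vote([model_a, model_b, model_c])
```

In this toy case each model errs on a different sample, so the vote corrects every individual error. That is the intuition behind combining classifiers, not a reproduction of the reported 98.82% result.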

Keywords

Data mining; breast cancer diagnosis

Parallel Abstract


Breast cancer is one of the most prevalent diseases among women around the world. Thanks to advances in medical treatment, approximately 60% of patients with breast cancer survive for ten years when early diagnosis is coupled with appropriate treatment. The survival rate for Stage 1 breast cancer is over 80%, and for Stage 0 it is nearly 100%. With constant technological progress and ever-increasing reliance on computers, a huge amount of medical information on hospital patients can be easily acquired and effectively analyzed. Data mining methods can be used to process and classify this information, providing a valuable reference that helps doctors reach more accurate diagnoses efficiently. To develop a solid diagnosis-support model for breast cancer, the study uses BPN (Back Propagation Networks), C5.0, BN (Bayesian Networks), SVM (Support Vector Machines), LR (Logistic Regression), DA (Discriminant Analysis), CBR (Case-Based Reasoning), and MARS (Multivariate Adaptive Regression Splines) to examine and classify data obtained from breast FNA (Fine-Needle Aspiration) analyses. The breast cancer diagnosis models developed by the study include the single diagnosis model (incorporating both diagnosis and reconfirmation) and the multi-combinational diagnosis model (including the inconsistency-based model and the voting model). In addition, a GA (Genetic Algorithm) is used to identify the best combination of breast cancer diagnosis models. The results show that the reconfirmation model, the inconsistency-based model, and the voting model are all superior to a single diagnosis model. The voting model reports the best performance, with an accuracy rate of 98.82%. Utilizing GA effectively reduces the time spent on model construction, helps identify the best combination of prediction models for efficient diagnosis of breast cancer, provides doctors with a valuable reference, and enhances the accuracy of diagnosis.
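
As a rough illustration of how a GA might search for a good combination of models, the sketch below encodes each candidate ensemble as a 0/1 mask over the available models and uses majority-vote accuracy as fitness. The abstract does not describe the encoding, operators, or parameters used in the thesis, so everything here is an assumption: the function names, population size, mutation rate, selection scheme, and toy predictions are all hypothetical.

```python
import random
from collections import Counter


def vote_accuracy(mask, predictions, actual):
    """Accuracy of a majority vote over the models selected by a 0/1 mask."""
    chosen = [p for bit, p in zip(mask, predictions) if bit]
    if not chosen:
        return 0.0  # an empty ensemble cannot predict anything
    correct = 0
    for votes, truth in zip(zip(*chosen), actual):
        label = Counter(votes).most_common(1)[0][0]
        correct += label == truth
    return correct / len(actual)


def ga_select(predictions, actual, pop_size=12, generations=20,
              mutation_rate=0.1, seed=0):
    """Search for a model subset with high voting accuracy (needs >= 2 models)."""
    rng = random.Random(seed)
    n_models = len(predictions)

    def fitness(mask):
        return vote_accuracy(mask, predictions, actual)

    population = [tuple(rng.randint(0, 1) for _ in range(n_models))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry the two best masks forward
        while len(next_gen) < pop_size:
            # Truncation selection: parents come from the better half.
            parent_a, parent_b = rng.sample(ranked[:pop_size // 2], 2)
            cut = rng.randrange(1, n_models)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


# Hypothetical toy data: 1 = malignant, 0 = benign; four stand-in models.
actual = [1, 0, 0, 1, 0, 1]
model_preds = [
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
]
mask, acc = ga_select(model_preds, actual)
```

With only four models an exhaustive search over all 16 masks would be trivial. A GA becomes attractive when the number of candidate models and configurations makes enumeration too slow, which matches the time-saving motivation stated in the abstract.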

Parallel Keywords

Data mining; breast cancer diagnosis

