Colorectal cancer ranked third among causes of cancer death nationwide in 2011. Although its incidence is high, the cure rate is very high when it is treated early. In recent years, the westernization of lifestyle and dietary habits has made the colon more prone to developing adenomatous polyps, which carry a risk of malignant transformation. Adenomatous polyps are the most common newly formed intestinal polyps and are the precursor lesions of colorectal cancer: more than 90 percent of colorectal cancers develop from colorectal adenomatous polyps through malignant progression over more than 10 years. This study applies data preprocessing and data mining methods, including decision trees and logistic regression, to extract classification rules for colorectal adenomatous polyps and construct predictive models. The results show that the important influencing factors are weight, age, BMI, gender, and height. The study also extracts five simple decision rules to construct the best predictive model for classifying colorectal adenomatous polyps. In terms of performance, the logistic regression model achieves 77.3% accuracy, whereas the decision tree classification model constructed in this study reaches 88.0%, and the improvement passes a statistical significance test. The results can serve as a reference for hospitals and the public in pre-screening assessment for colorectal cancer, in support of early detection and early treatment.
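The abstract does not say which significance test was used to compare the two classifiers. As a minimal sketch, assuming both models are evaluated on the same test cases, an exact McNemar test on their paired predictions is one common choice. The function names (`accuracy`, `mcnemar_exact`) and the ten toy labels below are hypothetical and for illustration only; they are not the study's data or its reported procedure.

```python
from math import comb


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts cases only model A classified
    correctly and c counts cases only model B classified correctly.
    """
    b = sum(a == t and m != t for t, a, m in zip(y_true, pred_a, pred_b))
    c = sum(a != t and m == t for t, a, m in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        # The two models never disagree in correctness: no evidence either way.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis the discordant cases split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy labels (1 = adenomatous polyp, 0 = none); NOT study data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
tree_pred = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # illustrative decision tree output
logit_pred = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # illustrative logistic regression output

tree_acc = accuracy(y_true, tree_pred)
logit_acc = accuracy(y_true, logit_pred)
b, c, p_value = mcnemar_exact(y_true, tree_pred, logit_pred)
print(f"tree={tree_acc:.3f} logit={logit_acc:.3f} b={b} c={c} p={p_value:.3f}")
```

The test uses only the cases where exactly one model is correct, so it suits a comparison of two classifiers scored on the same patients. With a sample as small as this toy one, even a large accuracy gap will generally not reach significance.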