Cancer screening and advanced treatments have not only improved treatment outcomes and patient survival rates but have also led to an increase in the number of diagnosed synchronous colorectal cancer (SCC) cases. This study used machine learning techniques to develop a predictive model that identifies the risk factors and clinical features of SCC. The clinical dataset comprised 4,287 valid records obtained from three cancer registries. Fourteen independent variables were selected as candidate risk factors to analyze the characteristics of SCC. Seven classification techniques (naive Bayes, logistic regression, K-Star, random committee, randomizable filtered classifier, random forests, and random tree) were tested using the Waikato Environment for Knowledge Analysis (Weka) software. Performance was evaluated in terms of sensitivity, accuracy, specificity, F-measure score, and precision. The results showed that the most important risk factors of SCC are combined stage, tumor size, chemotherapy, and grade/differentiation. Among the classification techniques tested, naive Bayes achieved the highest accuracy (90.03%). Finally, chemotherapy for SCC is an important factor that warrants future observation.
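Cancer screening and advanced treatments have not only improved treatment outcomes and patient survival rates but have also led to an increase in the number of diagnosed synchronous colorectal cancer (SCC) cases. This study used machine learning techniques to develop a predictive model for SCC in order to identify its risk factors and clinical features. The clinical data comprised 4,287 valid records from three cancer registries. Fourteen independent variables were selected as candidate risk factors on the basis of a clinician panel and a literature review. Seven classification techniques were validated on the Waikato platform: naive Bayes, logistic regression, K-Star, random committee, randomizable filtered classifier, random forests, and random tree. Model performance was analyzed in terms of sensitivity, accuracy, specificity, F-measure score, and precision. The results showed that combined stage, tumor size, chemotherapy, and grade/differentiation are the most important risk factors of SCC.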
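
For readers unfamiliar with the five performance indicators named in the abstract, each can be derived from a binary confusion matrix. The following is a minimal Python sketch under stated assumptions, not the study's Waikato workflow: the class `ConfusionCounts`, the function `classification_metrics`, and the example counts are hypothetical and chosen only for illustration. They are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives: positive cases predicted as positive
    fp: int  # false positives: negative cases predicted as positive
    tn: int  # true negatives: negative cases predicted as negative
    fn: int  # false negatives: positive cases predicted as negative


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the five indicators named in the abstract from raw counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F-measure (F1): harmonic mean of precision and sensitivity
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Illustrative counts only (assumed values, not results from the study).
example = ConfusionCounts(tp=40, fp=10, tn=140, fn=10)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because a classifier can reach high accuracy by favoring the majority class.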