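Polycystic ovary syndrome (PCOS) affects 5% to 10% of women of reproductive age; in Taiwan, an estimated 230,000 to 450,000 women live with the condition. PCOS arises from an endocrine disorder that prevents ova from maturing fully and leads to irregular menstrual cycles. Patients with PCOS also frequently develop type 2 diabetes and cardiovascular disease, which substantially affects quality of life. The causes of PCOS have not been definitively established, and treatment remains symptomatic, with no cure available. This study therefore applies two data mining techniques, the decision tree algorithm (C4.5) and the association analysis algorithm (CBA), to data on local Taiwanese PCOS patients provided by the reproductive medicine center of Mackay Memorial Hospital (馬偕醫院). By analyzing these data, we aim to understand the relationships among the factors that contribute to PCOS, to identify meaningful attribute rules and useful hidden knowledge in a large volume of laboratory test data, and to provide this information to the medical team as decision support for diagnosis, so that patients receive appropriate treatment more precisely and unnecessary use of medical resources is avoided. The analysis of the Taiwanese PCOS patient data yielded an unexpected result: AMH is the most important factor in diagnosing PCOS from blood samples, and the rule {AMH, Testosterone} achieves very high accuracy. Finally, the decision results produced by the data mining techniques give physicians in Taiwan support for rapid diagnosis and prediction.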
Polycystic ovary syndrome (PCOS) is one of the most common female endocrine disorders. In Taiwan, an estimated 230,000 to 450,000 women are affected by PCOS. It is a complex, heterogeneous disorder of uncertain etiology and one of the major causes of infertility. We examine one PCOS data set and describe a method of applying data mining techniques to it, together with our experiment and results. The data come from a regional teaching hospital in northern Taiwan and cover 499 women. We used the C4.5 decision tree algorithm and Classification Based on Associations (CBA), with a binary target variable indicating whether a woman has PCOS and seven predictors: Age, AMH, BMI, Testosterone, AC Sugar, AC Insulin, and PC Sugar. Unexpectedly, the variable most strongly associated with PCOS is AMH, and the most important rule is {AMH, Testosterone}. Data mining can discover novel associations that are useful to clinicians and administrators.
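To illustrate how a C4.5-style decision tree ranks a numeric predictor, the following minimal sketch computes the gain ratio of a threshold split. It is not the study's implementation. The function names are hypothetical, the toy values are made-up numbers rather than patient data, and the threshold search is simplified (midpoints between distinct values, scored directly by gain ratio).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: entropy before the split minus weighted entropy after.
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Return (threshold, gain_ratio) for the best midpoint candidate."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = [(t, gain_ratio(values, labels, t)) for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy, made-up measurements and class labels (1 = positive, 0 = negative).
toy_values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
toy_labels = [0, 0, 0, 1, 1, 1]
threshold, ratio = best_threshold(toy_values, toy_labels)
print(threshold, ratio)
```

In a full tree builder, this score would be computed for every predictor (Age, AMH, BMI, and so on), and the predictor with the highest gain ratio would be placed at the current node; a predictor that ends up at the root is, in this sense, the most informative one.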