Symptom-Based Discovery of Patients' Diseases Using Clustering Techniques

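This study uses patients' diagnostic records as the mining source. Each record contains a patient's symptoms and diagnosed diseases. Taking one patient's symptoms as the mining target, we apply clustering, a data mining technique, to discover which diseases the patient is likely to have. We first enumerate the subsets formed from the patient's symptoms; let the number of subsets be k, with k ≥ 1. The symptoms of each subset are set as the center point of one group, and we design a method that clusters the diagnostic records into k groups such that the total symptom similarity of the k groups after clustering is maximized. From each group we then find the disease with the highest cumulative occurrence count, which serves as the basis for inferring the disease tendency associated with the patient's symptoms. The clustering process retains the spirit of the PAM algorithm, while the diagnostic records that replace the original center points also preserve the group uniqueness of each symptom subset. In addition, diagnostic records that contain none of the patient's symptoms are deleted, which improves the efficiency of the subsequent clustering computation. Based on the proposed method, we design and build a mining system for diagnosing patients' diseases. Its results can provide useful reference information both for members of the public checking the disease tendencies of their own symptoms and for medical staff with limited clinical experience who need support in diagnosis.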
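The procedure outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, so Jaccard similarity is assumed here; the record layout `(symptom_set, disease)`, the function names, and the toy data are all hypothetical; and the PAM-style replacement of center points is omitted because its details are not given. For fixed centers, assigning each record to its most similar center maximizes the total similarity.

```python
from collections import Counter
from itertools import combinations


def nonempty_subsets(symptoms):
    """Return every non-empty subset of the patient's symptoms as a frozenset."""
    items = sorted(symptoms)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def jaccard(a, b):
    """Jaccard similarity: an assumed stand-in for the paper's symptom similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disease_tendency(patient_symptoms, records):
    """Map each non-empty group center to its most frequent disease.

    records: list of (symptom_set, disease) pairs (hypothetical layout).
    """
    patient = frozenset(patient_symptoms)
    # Drop records that share no symptom with the patient.
    relevant = [(frozenset(s), d) for s, d in records if patient & frozenset(s)]
    # Each non-empty subset of the patient's symptoms is one group center.
    centers = nonempty_subsets(patient)
    groups = {c: Counter() for c in centers}
    for symptoms, disease in relevant:
        # Assign the record to the most similar center (first one wins on ties).
        best = max(centers, key=lambda c: jaccard(c, symptoms))
        groups[best][disease] += 1
    # Report the most frequent disease of every group that received records.
    return {c: counts.most_common(1)[0][0]
            for c, counts in groups.items() if counts}


# Made-up toy records, for illustration only.
records = [
    ({"fever", "cough"}, "flu"),
    ({"fever", "cough"}, "flu"),
    ({"fever", "cough"}, "cold"),
    ({"fever"}, "infection"),
    ({"rash"}, "allergy"),
]
result = disease_tendency({"fever", "cough"}, records)
for center, disease in result.items():
    print(sorted(center), "->", disease)
```

Because the number of non-empty subsets grows as 2^n - 1 for n symptoms, a sketch like this is only practical for a small number of symptoms per patient.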

Keywords

data mining; clustering; symptom; disease

Parallel Abstract

This paper uses diagnostic data as the mining source; each diagnostic record contains a patient's symptoms and diseases. A clustering technique from data mining is used to analyze the tendency of a patient's diagnosed diseases. We take a patient's symptoms as the mining target and assign each subset of the symptoms as the center point of a group. We present a method that assigns each diagnostic record to the group whose center point it is most similar to in symptoms, so that the sum of the symptom similarities of the groups after clustering is maximized. The clustering process keeps the spirit of the PAM algorithm, and the diagnostic records that replace the original center points also retain the group uniqueness of each symptom subset. Deleting diagnostic records that do not contain any of the patient's symptoms improves the efficiency of subsequent clustering computations. The most likely diseases for the patient's symptoms are then found from the groups. Based on the presented method, a system for mining patients' diseases is designed and built. The mining results can provide useful information for people checking their own symptoms and for inexperienced medical staff diagnosing diseases.

Parallel Keywords

data mining; clustering; symptom; disease
