Discovering Meaningful Fuzzy Association Rules from Medical Databases

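This study proposes a four-stage procedure for mining medical databases that addresses problems common in existing association rule research: discovered rules with unclear semantics, redundant rules, and meaningful rules lost because of the limits of the conventional support/confidence mechanism. To make the discovered rules semantically clear, the study first uses a cluster partitioning technique to automatically transform the quantitative data fields of the data table into fuzzy sets expressed as linguistic terms. It then applies self-organizing map (SOM) cluster analysis, guided by the relatively important data fields obtained through sensitivity analysis and by the characteristics of the data itself, to divide all records into several clusters with similar internal features, and performs association rule analysis on each cluster. Next, an algorithm designed around the concept of a fuzzy resemblance relation merges redundant association rules that are semantically similar. Merging effectively reduces the number of discovered rules, and the rules retained are more informative and easier to interpret and apply in the medical domain. To judge the credibility of the rules, the study also applies the truth value evaluation method from fuzzy databases and retains the rules with higher truth degrees. Finally, the proposed approach is validated on a real disease medical database.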
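The first stage, turning a quantitative field into linguistic terms, can be illustrated with a minimal sketch. It assumes triangular membership functions anchored at cluster centers that have already been found; the abstract does not state the membership shape, and the function name, labels, and center values below are hypothetical and not taken from the study.

```python
def triangular_memberships(value, centers):
    """Return the membership degree of `value` in each fuzzy set.

    `centers` is an ascending list of cluster centers, one per linguistic
    term. Assumption: neighbouring triangular sets overlap so that the
    degrees for any value sum to 1.
    """
    degrees = [0.0] * len(centers)
    if value <= centers[0]:
        degrees[0] = 1.0          # at or below the lowest center
        return degrees
    if value >= centers[-1]:
        degrees[-1] = 1.0         # at or above the highest center
        return degrees
    for i in range(len(centers) - 1):
        left, right = centers[i], centers[i + 1]
        if left <= value <= right:
            t = (value - left) / (right - left)
            degrees[i] = 1.0 - t  # fades out of the left-hand term
            degrees[i + 1] = t    # fades into the right-hand term
            break
    return degrees


# Hypothetical centers for a laboratory value, labelled low / normal / high.
labels = ["low", "normal", "high"]
centers = [90.0, 120.0, 160.0]
fuzzified = dict(zip(labels, triangular_memberships(105.0, centers)))
```

With these illustrative centers, a reading of 105 belongs half to "low" and half to "normal", which is the kind of graded, linguistically readable value the later stages operate on.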

Keywords

Data mining; cluster partitioning; self-organizing map (SOM); fuzzy association rule; fuzzy resemblance relation; truth value

Parallel Abstract

In data mining applications, association rules can support decision making. However, association rule algorithms usually yield a large number of rules, many of which contain redundant or irrelevant information or describe trivial knowledge. In this paper we present a four-stage data mining process for finding relevant fuzzy association rules in medical databases. Fuzzy association rules are especially suitable for medical data mining because they are simple, linguistically interpretable rules that avoid the drawbacks of symbolic or crisp association rules. In the first stage, a cluster partitioning technique automatically transforms quantitative values into fuzzy linguistic terms, which are modeled as fuzzy sets defined on the appropriate attribute domains. Next, a Kohonen self-organizing map (SOM) identifies clusters based on shared feature attribute values. The resulting clusters are then classified by feature attributes determined using an Apriori association rule algorithm. Because the association rule algorithm tends to generate large numbers of rules, we present interactive strategies that prune redundant association rules on the basis of a fuzzy resemblance relation to improve readability, and we evaluate the truth degree of the discovered fuzzy association rules with a truth evaluation mechanism. Finally, we demonstrate our approach on a real disease medical database.
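To make the support/confidence mechanism concrete for fuzzy data, here is a minimal sketch of one common formulation, in which the degree to which a record matches an itemset is the minimum of its membership degrees. This is an assumption for illustration; the abstract does not specify the paper's exact measures or its truth evaluation mechanism, and the item names and records are hypothetical.

```python
def fuzzy_support(records, itemset):
    """Average degree to which the records match every item in `itemset`.

    Each record maps an item such as "glucose=high" to a membership degree
    in [0, 1]; missing items count as 0. `itemset` must be non-empty.
    """
    if not records:
        return 0.0
    total = 0.0
    for record in records:
        # min acts as the fuzzy AND across the items of the itemset
        total += min(record.get(item, 0.0) for item in itemset)
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Fuzzy support of the whole rule divided by that of its antecedent."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, list(antecedent) + list(consequent)) / base


# Hypothetical fuzzified patient records.
records = [
    {"glucose=high": 0.8, "bmi=high": 0.6},
    {"glucose=high": 0.2, "bmi=high": 0.9},
    {"glucose=high": 1.0, "bmi=high": 1.0},
    {"glucose=high": 0.0, "bmi=high": 0.5},
]
support = fuzzy_support(records, ["glucose=high", "bmi=high"])
confidence = fuzzy_confidence(records, ["glucose=high"], ["bmi=high"])
```

Because every record contributes a graded degree rather than a yes/no match, a rule is not dropped merely because values fall just outside a crisp interval boundary.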

Parallel Keywords

Data mining; cluster partitioning; self-organizing map (SOM); fuzzy association rule; fuzzy resemblance relation; truth value

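Cited By

陳俊旗 (2013). Reproducing decision tree classification results with association algorithms [Doctoral dissertation, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2013.00144

楊欣明 (2009). Application of data mining to follow-up after health examinations [Master's thesis, National Pingtung University of Science and Technology]. Airiti Library. https://doi.org/10.6346/NPUST.2009.00237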

Hu, Y. C., Lin, J. Y., & Lin, A. (2013). Analyzing Investment Regions in Mainland China for Taiwanese Firms by Association Rule Mining. Asia Pacific Management Review, 18(2), 143-160. https://doi.org/10.6126/APMR.2013.18.2.02
