
Classifying Chinese Text Documents by Association Rules (利用關聯式法則將中文文件分類)

Advisor: 黃連進

Abstract


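An improved TFIDF formula is used to compute the weight of each feature term. From the resulting weight table, the sum of weights of each document with respect to each category can be computed. At the same time, association rule mining is used to find feature terms that occur together in the same document, and these co-occurring terms are treated as new rules. The occurrences of each new rule are counted per category in the training documents, and the rules that can help classification are selected according to their confidence and support. The selected rules are then used to correct documents assigned to the wrong category, which raises classification accuracy.

In addition to using the improved TFIDF to weaken the weights of noise terms that are distributed too widely, which reduces the impact of noise terms not fully removed during preprocessing, this thesis relies mainly on the new rules obtained by association rule mining. Redundant rules are filtered out for each of the possible cases, rules are applied in order of decreasing confidence and then decreasing rule length to correct classification errors, and the order in which categories are considered is adjusted, so that classification accuracy improves. The experimental results show that, after correction by the proposed method, the efficiency of document classification improves substantially.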
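The weight-table step described above can be illustrated with a minimal Python sketch. The abstract does not give the improved TFIDF formula, so the sketch substitutes the textbook TF-IDF weight (term frequency within a category times the log of inverse document frequency); this is an assumption, not the thesis's formula. The function names `build_weight_table` and `classify` and the toy data are hypothetical, and documents are assumed to be already segmented into terms.

```python
import math
from collections import Counter, defaultdict

def build_weight_table(train_docs):
    """Build {category: {term: weight}} from (tokens, category) pairs.

    Textbook TF-IDF stands in for the thesis's improved formula (assumption).
    """
    n_docs = len(train_docs)
    doc_freq = Counter()              # number of documents containing each term
    term_freq = defaultdict(Counter)  # term counts accumulated per category
    for tokens, category in train_docs:
        doc_freq.update(set(tokens))
        term_freq[category].update(tokens)
    table = {}
    for category, counts in term_freq.items():
        total = sum(counts.values())
        table[category] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return table

def classify(tokens, table):
    """Sum the term weights per category and return the best category and all scores."""
    scores = {
        category: sum(weights.get(term, 0.0) for term in tokens)
        for category, weights in table.items()
    }
    return max(scores, key=scores.get), scores

# Toy training set (hypothetical); "news" appears in every document.
train = [
    (["stock", "fund", "news"], "finance"),
    (["stock", "rate", "news"], "finance"),
    (["team", "match", "news"], "sports"),
    (["team", "score", "news"], "sports"),
]
table = build_weight_table(train)
label, scores = classify(["stock", "news"], table)
```

In this toy set the term `news` occurs in every document, so its inverse document frequency is log(1) = 0 and it contributes nothing to any category score. That is the same effect, in its simplest form, that the abstract attributes to weakening widely distributed noise terms.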

Parallel Abstract


An improved TFIDF is used to build a weight table, from which the system computes the sum of weights of each document relative to each category. In this way, documents that have not yet been labeled can be classified. In this paper, we use the improved TFIDF to calculate keyword weights and then combine two words into a new word by association rules, which enlarges the set of keywords. We apply association rule techniques in the data mining miner: the features of the weight table are input into the miner, and the resulting rules are examined and sorted by confidence, support, and rule length before being saved into the rule base. This makes the classification more efficient.
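The rule-mining step can be sketched in the same spirit. The code below is a hedged illustration, not the thesis's miner: it enumerates co-occurring term sets by brute force rather than with an algorithm such as Apriori, uses the common class-association-rule definitions (support is the fraction of training documents that contain the term set and belong to the rule's category; confidence is the fraction of documents containing the term set that belong to that category), and orders rules by decreasing confidence, then decreasing length, with support as a final tie-breaker. The thresholds, the tie-breaking order, the function names `mine_rules` and `correct_label`, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def mine_rules(train_docs, min_support=0.2, min_confidence=0.8, max_len=3):
    """Return rules (itemset, category, confidence, support), best rule first."""
    n_docs = len(train_docs)
    by_itemset = defaultdict(Counter)  # itemset -> count of documents per category
    for tokens, category in train_docs:
        terms = sorted(set(tokens))
        for size in range(2, max_len + 1):
            for itemset in combinations(terms, size):
                by_itemset[itemset][category] += 1
    rules = []
    for itemset, cat_counts in by_itemset.items():
        total = sum(cat_counts.values())
        category, hits = cat_counts.most_common(1)[0]
        support = hits / n_docs
        confidence = hits / total
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, category, confidence, support))
    # Decreasing confidence, then decreasing rule length, then decreasing support.
    rules.sort(key=lambda r: (-r[2], -len(r[0]), -r[3]))
    return rules

def correct_label(tokens, predicted, rules):
    """Override the predicted category with the first rule whose terms all occur."""
    present = set(tokens)
    for itemset, category, _confidence, _support in rules:
        if present.issuperset(itemset):
            return category
    return predicted

# Toy training set (hypothetical).
train = [
    (["stock", "fund", "news"], "finance"),
    (["stock", "fund", "rate"], "finance"),
    (["team", "match", "news"], "sports"),
    (["team", "match", "score"], "sports"),
]
rules = mine_rules(train, min_support=0.5)
fixed = correct_label(["stock", "fund", "team"], "sports", rules)
unchanged = correct_label(["weather"], "sports", rules)
```

With a minimum support of 0.5, only the term sets that appear in two of the four toy documents survive, so a document containing both `stock` and `fund` is moved to `finance` even if the weight-based step had labeled it `sports`, while a document matching no rule keeps its original label.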


Cited By


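鄭哲明 (2015). A Study on Applying Data Mining to the Automatic Classification of Customer Problems: The Case of a Water Company's Public Opinion Mailbox [Master's thesis, National Chiao Tung University]. Airiti Library. https://doi.org/10.6842/NCTU.2015.00091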
