
集成學習在信貸不平衡資料上之應用

Application of Ensemble Learning in Imbalanced Personal Credit Data

Advisor: 陳景祥
Co-advisor: 李百靈 (Pai-Ling Li)

Abstract


Credit cards account for a large share of consumer spending today. For banks they are a major source of business, but they also raise the risk of customer default, which can cause heavy losses. Defaulting customers make up only a small minority of all customers and are hard to detect, so identifying them is an imbalanced-data problem. This study applies several resampling methods to preprocess the data and combines ensemble learning with four machine learning models (logistic regression, support vector machine, random forest, and extreme gradient boosting) to identify potential defaulters and thereby reduce loss costs. Using appropriate evaluation metrics, we compare how the different resampling methods perform when paired with the ensemble models, and we discuss the application of ensemble learning to imbalanced data.
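
As a rough illustration of the resampling step, the sketch below balances a toy data set by random oversampling of the minority (default) class. The abstract does not say which resampling methods the study uses, so random oversampling is an assumption chosen for simplicity, and the function name `random_oversample`, the toy features, and the 95/5 class split are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Draw extra minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = majority + minority + extra
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): 95 non-default (0) and 5 default (1) customers.
rows = [[i, i % 7] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

balanced_rows, balanced_labels = random_oversample(rows, labels, minority_label=1)
counts = Counter(balanced_labels)
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class ratio.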

Keywords

Imbalanced Data; Resampling; Ensemble Learning; Bagging; Stacking
