Credit Default Prediction Based on Ensemble Learning: The Case of Credit Card Customers

Advisor: 江彌修

Abstract


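Credit risk, one of the main sources of risk for financial institutions, is the risk that a counterparty or borrower defaults. Based on the Blending and Stacking ensemble learning frameworks, this study builds an early-warning model for credit card default risk. The model predicts the likelihood that existing customers will default in the future, so that countermeasures can be taken before a default occurs, and the predictive performance of single models serves as the benchmark for comparison. The study examines credit card customers of a large domestic bank. The sample covers April to September 2005 and includes transaction-related information for each cardholder over this period, such as spending amounts, payment amounts, and default records, together with the cardholder's personal information. In addition to data preprocessing and feature engineering on the raw data, the study applies the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. Model performance is measured with metrics that are better suited in practice to assessing credit risk, such as Type II error and the area under the ROC curve (AUC). The empirical results show that, compared with single models and the Blending framework, the model built with the Stacking framework performs best under these metrics, confirming that ensemble learning can effectively improve model performance. This holds, however, only if the first-layer classifiers of the ensemble are chosen according to two criteria: (1) the models should preferably differ from one another, and (2) their predictive performance must not differ too much.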
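To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the thesis's implementation: the function name `smote_sketch`, the parameters `k` and `seed`, and the toy points are all assumptions chosen for illustration.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority point and one of its k nearest minority
    neighbours. Hypothetical helper; assumes len(minority) >= 2."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # All other minority points, sorted by squared Euclidean distance.
        others = [x for j, x in enumerate(minority) if j != i]
        others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(base, x)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class (e.g. defaulters) in a two-feature space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=5)
```

Each synthetic point lies on a line segment between two real minority samples, so the minority class grows without duplicating existing records exactly.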

Parallel Abstract


Credit risk is the risk that a borrower or counterparty fails to make required payments on a debt, and it is a main source of risk for most financial institutions. This research constructs an ensemble-learning-based credit risk model, built on the Blending and Stacking approaches, for predicting credit card payment default. With the default alerts provided by the model, financial institutions can take countermeasures to avoid losses from existing customers who default. We also benchmark the performance of the ensemble models against their base classifiers. The study uses payment data from October 2005 from an important bank in Taiwan, and the targets are the bank's existing credit card holders. The customer data include bill statement amounts, previous payment amounts, past monthly payment records, and personal information. In addition to data preprocessing and feature engineering, we apply the Synthetic Minority Oversampling Technique (SMOTE) to deal with the imbalanced data. We evaluate the classification models with three metrics that are applicable to credit risk management in practice: Type II error, F1-score, and the area under the ROC curve (AUC). The results show that the classification model built with the Stacking approach outperforms both the base classifiers and the Blending approach. The experimental evaluation also shows that ensemble learning can effectively improve overall classification performance, provided that the base classifiers are highly diverse and locally accurate.
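As a rough illustration of the three evaluation metrics named in the abstract, the sketch below computes them on a toy example using only the Python standard library. It assumes that default is labeled 1 and that Type II error means the share of actual defaulters predicted as non-defaulters (the false negative rate). The abstract does not state its exact definition, so that reading, the function names, the 0.5 threshold, and the toy data are all assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) with default (label 1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def type_ii_error(y_true, y_pred):
    """Assumed definition: missed defaulters / all actual defaulters.
    Requires at least one actual defaulter in y_true."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return fn / (fn + tp)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, in its count form.
    Requires at least one true or predicted positive."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true, scores):
    """Probability that a random defaulter scores above a random
    non-defaulter, counting ties as one half (rank-based AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: three defaulters and five non-defaulters with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 cutoff

print(type_ii_error(y_true, y_pred))  # one of three defaulters is missed
print(f1_score(y_true, y_pred))
print(auc(y_true, scores))
```

Type II error and F1-score depend on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is one reason to report them together.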
