Applying Machine Learning Methods to Promote the Wealth Management Business of an Integrated Securities Firm

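This study uses statistical and machine learning methods to identify the customer characteristics of the three wealth management businesses of President Securities (insurance agency, overseas markets, and wealth management) and the key differences between customers who have and have not purchased these products. The aim is to let salespeople identify the product preferences of newly opened accounts more efficiently and systematically, raising account openings and transactions and, in turn, revenue. The research process begins with k-means clustering of the customers of the three businesses, which builds a complete profile of wealth management customers and traces how their characteristics changed over two years (2018 and 2019). In the second step, securities customers who have never purchased wealth management products are pooled with wealth management customers, a classification algorithm is used to train models, and the effect of the statistically significant variables on purchasing behavior is examined, so that the model can estimate, from the characteristics of a newly opened account, the probability that the customer will purchase wealth management products. In the logistic regression classification, the two sets of models share the same five significant variables: age, gender, days since registration, account opened before 2008, and account opened in a year-end bonus month. Total purchase amount, unexpectedly, is not significant. The classification results can be used to screen out accounts with low purchase potential and focus on potential customers, while the clustering results can be used to tailor services to different customer groups, raising transactions and revenue.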
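The two-step workflow described above (cluster existing customers, then score new accounts with a logistic model) can be illustrated with a minimal sketch. This is not the study's implementation: the data are synthetic, the two standardized features (imagined here as age and days since registration) are hypothetical stand-ins, and the logistic model is fitted by plain gradient descent, which yields coefficients and probabilities but not the significance tests reported in the study.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log-loss; returns [bias, w1, w2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Estimated probability that an account with features x is a buyer."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Synthetic, standardized two-feature data (hypothetical, for illustration only).
rng = random.Random(42)
buyers = [(rng.gauss(1, 0.5), rng.gauss(1, 0.5)) for _ in range(100)]
non_buyers = [(rng.gauss(-1, 0.5), rng.gauss(-1, 0.5)) for _ in range(100)]

# Step 1: cluster existing buyers to sketch customer segments.
centroids, labels = kmeans(buyers, k=2)

# Step 2: pool buyers (1) with non-buyers (0) and fit the classifier.
X = buyers + non_buyers
y = [1] * len(buyers) + [0] * len(non_buyers)
w = fit_logistic(X, y)

# Score two hypothetical newly opened accounts.
p_high = predict_proba(w, (1.0, 1.0))
p_low = predict_proba(w, (-1.0, -1.0))
```

In practice a statistics package would be used for the second step, since deciding which variables are significant requires standard errors for the coefficients, which this sketch does not compute.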

Keywords

Securities; customer segmentation; classification; machine learning; k-means; logistic regression

Parallel Abstract

This study uses statistical and machine learning methods to identify the customer characteristics of President Securities' three major financial management businesses (Insurance, Overseas, Wealth Management) and the key differences between customers who have and have not purchased, so that salespeople can identify the preferences of new customers more efficiently and systematically, raising the account-opening rate and transaction rate and, in turn, revenue. The process starts with k-means clustering of the customers of the three businesses, which establishes customer profiles for each business and tracks how customer characteristics changed over the data period (2018, 2019). In the second step, logistic classification is used to train models that distinguish buyers (1) from non-buyers (0), and the effect of the statistically significant features on customer buying behavior is discussed, so that the model can estimate the purchase probability of a new customer. In the logistic classification results, the two models share the same significant variables: Age, Gender, Days_of_registration, Registered_before_2008, and Registered_at_bonus_release. Total_buy_amount, interestingly, is not significant. In conclusion, the classification results make it possible to screen out users with low purchase probability and focus on potential customers, and the clustering results make it possible to optimize services for different customer groups.

Parallel Keywords

Stock Repurchase; Capital reduction by cash; Event Study; Multiple Regression Analysis
