Applying Data Mining Techniques to Predict Employee Performance

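In recent years, companies have placed growing emphasis on human resources, applying the management functions of planning, organizing, leading, and controlling to make good use of the human resources within the organization, so that employees perform well and the company's business goals are met. Employee performance in particular is an important factor that affects company performance and allows a company to remain competitive in the market. This study uses data mining techniques to identify the key factors that affect performance and builds models to predict which employees will perform excellently in the future. The data analyzed in this study are class-imbalanced; therefore, four data balancing methods are adopted, namely BorderlineSMOTE, SVM-SMOTE, Random Under Sampler, and ADASYN, and prediction models are then built with decision trees, logistic regression, random forests, support vector machines (SVM), and gradient boosting decision trees. The experimental results show that data balancing helps improve model performance. Because the aim of this study is to predict employees with excellent performance, the best predictive results in the experiments were obtained by handling the data imbalance with ADASYN and building the model with SVM. In addition, the important features were identified: the key factors affecting performance prediction include environment satisfaction, marital status, and monthly income.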

Keywords

Performance; decision tree; logistic regression; random forest; support vector machine; gradient boosting decision tree

Parallel Abstract

In recent years, companies have paid increasing attention to human resources, using the management functions of planning, organizing, leading, and controlling to make good use of the human resources in the organization and thereby achieve their business goals. Employee performance affects company performance and is an important factor in allowing companies to remain competitive in the market. This research uses data mining techniques to identify the key factors that affect employee performance, and establishes models to predict which employees will perform excellently in the future. The data set analyzed in this study is class-imbalanced. Accordingly, four data balancing methods are used, namely Borderline-SMOTE, SVM-SMOTE, Random Under Sampler, and ADASYN. Decision trees, logistic regression, random forests, Support Vector Machine (SVM), and Gradient Boosting Decision Trees (GBDT) are then used to build the prediction models. The experimental results show that data balancing helps to improve model performance. The goal of this study is to identify employees with excellent performance. In the experiments, the best performance was achieved by using ADASYN to deal with the data imbalance and SVM to construct the model. Moreover, this study identifies the key factors that affect performance prediction, including environment satisfaction, marital status, and monthly income.
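
Of the four balancing methods named above, random under-sampling is the simplest to illustrate. The following is a minimal sketch, not the study's implementation: the function name `random_under_sample`, the toy data, and the seed are all hypothetical, and the study's actual data set and settings are not given in the abstract. It randomly discards rows until every class is as small as the rarest one. In practice, a library such as imbalanced-learn (`RandomUnderSampler`) would typically be used instead.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so that every class has as many rows as the rarest class.

    Hypothetical helper for illustration only; not taken from the study.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    kept.sort()  # preserve the original row order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy, made-up data: 2 "excellent" employees (label 1) and 8 others (label 0).
rows = [[i] for i in range(10)]
labels = [1, 1] + [0] * 8

balanced_rows, balanced_labels = random_under_sample(rows, labels)
print(Counter(balanced_labels))  # both classes now have 2 rows each
```

Under-sampling discards majority-class rows, whereas the SMOTE variants and ADASYN synthesize new minority-class rows; the abstract reports that ADASYN combined with SVM gave the best results in the study's experiments.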