Financial restatements have significant implications for financial statement users, the capital market, and the companies involved. This study uses machine learning to predict the restatement category of restating companies. To mitigate the class imbalance that arises across restatement categories, two datasets are used. Five algorithms are then applied to predict the categories. Among them, the K Nearest Neighbor (KNN) algorithm shows the best prediction performance on both datasets, achieving 81.99% and 78.29%, respectively. When feature selection is applied to improve prediction, eXtreme Gradient Boosting (XGBoost) increases its prediction ability by 6.48% compared with KNN, reaching an accuracy of 83.90%. Notably, ReliefF proves more effective than Information Gain in improving accuracy. In addition, feature selection identifies nine risk variables for financial restatement, which are compared with the risk variables reported in previous empirical accounting research.
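As an illustration of the Information Gain criterion mentioned above, the following minimal sketch scores discrete features by how much they reduce the entropy of the class labels and keeps the top-ranked ones. It is not the study's implementation: the function names, the feature names (`hypothetical_flag_*`), and the toy data are all assumptions made for illustration, and continuous financial variables would first need to be discretized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, k):
    """Return the k feature names with the highest information gain."""
    scored = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Toy data: hypothetical indicators, NOT variables or results from the study.
labels = ["A", "A", "B", "B", "C", "C"]
columns = {
    "hypothetical_flag_1": [0, 0, 1, 1, 2, 2],  # separates all three classes
    "hypothetical_flag_2": [0, 1, 0, 1, 0, 1],  # carries no class information
    "hypothetical_flag_3": [0, 0, 0, 0, 1, 1],  # isolates class C only
}
top = rank_features(columns, labels, k=2)
```

In this toy case the ranking keeps the two informative indicators and drops the uninformative one. The retained features would then be passed to a classifier such as KNN or XGBoost.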