Degree thesis

Application of Transfer Learning and Text Mining to Reports to Shareholders (應用遷移學習與文字探勘分析致股東報告書)

Advisor: 林嬋娟

Abstract


This study first builds a text mining model with BERT (Bidirectional Encoder Representations from Transformers), a natural language processing method, and fine-tunes it on reports to shareholders. It then examines whether BERT overcomes the weaknesses of earlier text mining methods. Finally, it uses sentiment analysis to measure the tone of reports to shareholders and tests whether that tone affects a company's future performance.

The empirical results show that reports to shareholders must be preprocessed to handle mixed Chinese and English text, and that, once hyperparameters are selected on validation-set performance, the BERT model reaches a classification accuracy of 86%. Visualizing the model's operation shows that BERT captures the words modified by negation terms as well as the nouns modified by adjectives. In the context test, BERT's performance drops sharply once the word order is randomly shuffled, which indicates that BERT has learned the structure of the Chinese language.

As for the effect of tone on future performance, the sentiment of the report to shareholders in year t has no significant effect on earnings in year t+1. Possible explanations are that the sample is not sufficiently representative, or that Taiwan's reports to shareholders differ in information content from the MD&A in the United States, so that no significant relation between tone and future earnings appears.
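
The word-order test described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: `toy_tone` is a hypothetical rule-based classifier standing in for the fine-tuned BERT model, and the sample sentences, word lists, and labels are invented for the example. The idea is to compare accuracy on the original token order with the mean accuracy after each sentence's tokens are randomly shuffled. A large drop suggests the classifier relies on word order rather than on word counts alone.

```python
import random

# Hypothetical word lists for the toy classifier (not from the study).
NEGATIONS = {"not", "no", "never"}
POSITIVE = {"growth", "improved", "profit", "strong"}
NEGATIVE = {"loss", "decline", "weak"}


def toy_tone(tokens):
    """Order-sensitive stand-in for a tone classifier.

    A negation flips the polarity of the next sentiment word.
    Returns 1 for positive tone, 0 otherwise.
    """
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return 1 if score > 0 else 0


def accuracy(classify, samples):
    """Fraction of (tokens, label) pairs that the classifier labels correctly."""
    return sum(classify(toks) == label for toks, label in samples) / len(samples)


def context_test(classify, samples, trials=200, seed=0):
    """Return (accuracy on original order, mean accuracy on shuffled order)."""
    rng = random.Random(seed)
    original = accuracy(classify, samples)
    total = 0.0
    for _ in range(trials):
        shuffled = []
        for toks, label in samples:
            toks = list(toks)  # copy so the original sample is untouched
            rng.shuffle(toks)
            shuffled.append((toks, label))
        total += accuracy(classify, shuffled)
    return original, total / trials


# Invented sample sentences: (tokens, label), label 1 = positive tone.
SAMPLES = [
    (["profit", "did", "not", "decline"], 1),
    (["growth", "was", "not", "strong"], 0),
    (["no", "profit", "this", "year"], 0),
    (["sales", "improved"], 1),
]

if __name__ == "__main__":
    orig_acc, shuf_acc = context_test(toy_tone, SAMPLES)
    print(f"original order: {orig_acc:.2f}, shuffled order: {shuf_acc:.2f}")
```

To run the same test on a real model, `toy_tone` would be replaced by a function that wraps the fine-tuned classifier's prediction on a token sequence. The comparison logic in `context_test` stays the same.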

