使用角色分類結合文字探勘技術建立股市預測模型

Using Text Mining Technologies for a Role-Based Stock Prediction Model

Advisor: 洪智力

Abstract


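Stock price prediction has long been a popular research topic, but it is often poorly connected to current events. If real events are recorded daily and combined with artificial intelligence techniques, investors can be given better trend references for their choices, helping them maximize returns. Previous studies that applied text mining to the news side of stock prediction examined only the relationship between each stock news article and the frequency of the keywords appearing in it. Yet the same news release is read by investors whose occupations, backgrounds, and post-reading decisions differ, so keyword frequency alone cannot reflect how investors actually interpret news when real events occur. Analyzing news content together with investors' post-reading decisions makes it possible to understand how stock news content influences investors' judgment.

This study takes stock news as its starting point, combines text mining with investors' experience in interpreting news, and uses a support vector machine (SVM) to build a stock prediction model that incorporates role experience. To identify how different investors' experience relates to stock prices, the study divides investor roles into three categories: (1) finance-related workers, (2) general office workers, and (3) non-office workers. Using this occupational grouping, each investor is assigned to a category by occupation, the role groups with high prediction accuracy are identified, and investor experience is added to the prediction model.

The experimental results show that, in every price-change interval, all three role-experience models are more accurate than Logistic Regression and the traditional prediction model, demonstrating that adding role experience improves the prediction accuracy of traditional text mining applied to stock market news.

The study therefore proposes building a stock prediction model that combines role classification with text mining, offering a new conceptual framework that incorporates text mining and yields a prediction model informed by investor experience.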

Parallel Abstract


Forecasting stock trends has long been popular among researchers and practitioners, but it often lacks integration with current events. Recording current events daily and combining them with artificial intelligence techniques would enable stock forecasting models to give investors better predictions and help them make better investment decisions. Unlike previous research, where text mining was applied only to extracting useful knowledge from stock news, we introduce a conceptual model that highlights the different effects caused by investors with different vocations. Our proposed model extends the original stock news mining technique by analyzing the relationship between stock news and the opinions of readers who have different vocations. We integrate text mining with the experience of investors from different backgrounds and occupations, and use support vector machines to create a stock prediction model that incorporates role experience. To analyze the influence of investors with different vocations, we categorize them into three classes: (1) finance/banking workers, (2) general office workers, and (3) non-office workers. Through this classification, we locate the group with the highest prediction accuracy and add its experience to our proposed stock forecasting model. The results show that the three role-based forecasting models were more accurate than the logistic regression model and the traditional model in every price-change interval. This indicates that adding role experience can improve the forecasting accuracy of traditional text mining applied to stock market news. Therefore, our research proposes combining the experience of different occupations with text mining techniques to establish an extended stock forecasting model, thereby introducing a new framework for more accurate prediction.
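
The role-based idea in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not specify the features, kernel, or training data, so everything below is an assumption made for illustration. The sketch assumes whitespace-tokenized English headlines, bag-of-words counts, and a linear SVM trained by plain subgradient descent on the hinge loss. The headlines, the up/down calls attributed to each role, and the held-out price movements are hypothetical toy values chosen only so that the three roles score differently; they are not findings. One model is trained per role, and the role whose model best matches the held-out movements is selected.

```python
from collections import Counter

# The three investor roles named in the abstract.
ROLES = ("finance", "office_worker", "non_office_worker")

def tokenize(text):
    # Assumption: whitespace tokens; real Chinese news would need a word segmenter.
    return text.lower().split()

def build_vocab(texts):
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(text, vocab):
    # Bag-of-words term counts; tokens outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for tok, n in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec

def score(model, x):
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    # Subgradient descent on the L2-regularized hinge loss (linear SVM).
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w, b = model
            violated = yi * score(model, xi) < 1.0   # inside the margin?
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrink
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
            model = (w, b)
    return model

def predict(model, x):
    return 1 if score(model, x) >= 0.0 else -1

def accuracy(model, vocab, examples):
    hits = sum(predict(model, vectorize(t, vocab)) == label for t, label in examples)
    return hits / len(examples)

# Hypothetical toy data: (headline, up/down call made by a reader in that role).
TRAIN = {
    "finance": [("profit rises record", 1), ("loss widens lawsuit", -1),
                ("profit record orders", 1), ("loss lawsuit recall", -1)],
    "office_worker": [("profit rises record", 1), ("loss widens lawsuit", -1),
                      ("orders delayed", -1), ("recall resolved", 1)],
    "non_office_worker": [("profit rises record", -1), ("loss widens lawsuit", 1),
                          ("profit record orders", -1), ("loss lawsuit recall", 1)],
}
# Hypothetical held-out headlines with the price movement that followed.
HELD_OUT = [("profit orders", 1), ("lawsuit recall", -1),
            ("record profit rises", 1), ("loss widens", -1)]

vocab = build_vocab(t for rows in TRAIN.values() for t, _ in rows)
models = {}
for role in ROLES:
    X = [vectorize(t, vocab) for t, _ in TRAIN[role]]
    y = [label for _, label in TRAIN[role]]
    models[role] = train_linear_svm(X, y)

# Score each role's model against actual movements and keep the best role.
scores = {role: accuracy(models[role], vocab, HELD_OUT) for role in ROLES}
best_role = max(scores, key=scores.get)
print(scores, best_role)
```

Training a separate model per role keeps each group's reading of the news isolated, so the per-role accuracies can be compared directly before deciding whose experience to fold into a final model. A comparison against logistic regression, as reported in the abstract, would swap the hinge loss for the logistic loss on the same features.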

