Stock market investment has long been a topic of concern to investors. Historical stock market data can reflect future trends and are highly correlated with them, but analyses based on such data overlook the impact of news on short-term stock prices. Although many online news sites are available, investors cannot read a large volume of news articles every day and turn them into effective investment decisions. This study combines technical analysis (TA) and sentiment analysis (SA) to predict the Taiwan weighted stock index. The technical indicator (TI) feature set is derived from technical analysis, while sentiment analysis identifies emotion words in news articles to estimate future trends and form sentiment indicators (SI). The two kinds of indicators are combined to produce the feature sets TI, TI+Seed, TI+PMI, and TI+CE. Support vector regression (SVR) then learns the relationship between each feature set and the daily Taiwan weighted stock index in order to predict future prices, with the aim of reducing the prediction error rate over the test period.
The experimental results show that feature sets that add sentiment indicators, which capture the bullish or bearish tone of stock market news articles, achieve a lower error rate than technical indicators alone. Feature sets produced by the pointwise mutual information (PMI) and contextual entropy (CE) model expansion methods outperform the feature set built from seed words only. In addition, feature sets whose sentiment indicators are computed with emotion intensity further reduce the prediction error rate for the Taiwan weighted stock index.
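As an illustration of the seed-word expansion step, the following is a minimal sketch of how PMI might be used to grow a sentiment lexicon and turn an article into a single sentiment score. The abstract does not give the exact formulas, so everything here is an assumption: the document-level co-occurrence counts, the scoring rule (PMI with positive seeds minus PMI with negative seeds), the `threshold` value, the function names, and the toy English tokens standing in for segmented Chinese news text.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Build a pmi(a, b) function from document-level co-occurrence counts."""
    n = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing each word pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    def pmi(a, b):
        joint = pair_df[tuple(sorted((a, b)))]
        if joint == 0:
            # Simplifying assumption: unseen pairs count as "no association"
            # (the true PMI would be minus infinity).
            return 0.0
        # log2( P(a, b) / (P(a) * P(b)) ) with probabilities estimated as df / n
        return math.log2(joint * n / (word_df[a] * word_df[b]))

    return pmi


def expand_lexicon(docs, pos_seeds, neg_seeds, threshold=0.5):
    """Add words that associate clearly with one seed polarity (hypothetical rule)."""
    pmi = pmi_table(docs)
    vocab = {w for tokens in docs for w in tokens}
    pos, neg = set(pos_seeds), set(neg_seeds)
    for w in vocab - pos - neg:
        score = (sum(pmi(w, s) for s in pos_seeds)
                 - sum(pmi(w, s) for s in neg_seeds))
        if score > threshold:
            pos.add(w)
        elif score < -threshold:
            neg.add(w)
    return pos, neg


def sentiment_index(tokens, pos, neg):
    """Score one article in [-1, 1]: (positive - negative) / (positive + negative)."""
    p = sum(t in pos for t in tokens)
    q = sum(t in neg for t in tokens)
    return 0.0 if p + q == 0 else (p - q) / (p + q)


# Toy corpus: each inner list is one tokenized news article.
docs = [
    ["rise", "rally", "profit"],
    ["rise", "rally", "record"],
    ["fall", "slump", "loss"],
    ["fall", "slump", "debt"],
    ["rise", "profit"],
    ["fall", "loss"],
]
pos_words, neg_words = expand_lexicon(docs, ["rise"], ["fall"])
score = sentiment_index(["rally", "record", "debt"], pos_words, neg_words)
```

In a pipeline like the one described, a per-day score of this kind could be appended to the technical indicator features before training the regressor. The CE expansion and the intensity-weighted variant are not sketched here, since the abstract gives no detail on how they are computed.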