Predicting the Stock Price of Yuanta Taiwan 50 with Machine Learning

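With rapid technological progress, the growth of cloud networks, and faster computation, people increasingly like to combine technology with daily life, creating a digital era. At the same time, investors are eager to understand where stock prices are heading: by predicting whether a price will rise or fall, they can buy or sell the underlying stock and earn a reasonable return. Many scholars have long held that stock prices follow a random walk, which makes it difficult to find clues in past experience. Nevertheless, people still hope to use powerful machine learning, which simulates the way the human brain thinks, to predict the rise and fall of stock prices. This thesis takes Yuanta Taiwan 50 as its subject and studies the rise and fall of its price. It trains and tests models using logistic regression, kernel support vector machine (Kernel SVM), decision tree, random forest, extreme gradient boosting (XGBoost), and artificial neural network, and combines them with feature scaling and variable selection to construct different experimental scenarios, in the hope of finding the best predictive model. Among the general models, only XGBoost generalizes relatively well and shows no overfitting problem. Both the general models and the neural network models attain their own accuracy rates; for the goal of predicting whether the stock price rises or falls, logistic regression, XGBoost, and the artificial neural network all give better predictive performance.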

Keywords

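Stock price prediction; Logistic regression; Kernel support vector machine; Decision tree; Random forest; Extreme gradient boosting; Artificial neural network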

Parallel Abstract

With the rapid advancement of technology, the development of cloud networks, and increases in computing speed, people are increasingly fond of combining technology with daily life. In addition, investors would like to know the trend of stock prices: they wish to earn a reasonable return by predicting whether prices will rise or fall. It has long been believed that stock prices follow a random walk with no definite laws or trends, so it is difficult to find clues in past experience. Despite this, people still hope to predict the rise or fall of stock prices by using powerful machine learning, which simulates the way the human brain thinks. In this paper, I use Yuanta Taiwan 50 as the underlying stock and study the rise and fall of its price. I try models including logistic regression, kernel support vector machine, decision tree, random forest, extreme gradient boosting (XGBoost), and artificial neural network. I also combine these models with feature scaling (normalization and standardization) and variable selection (ridge and lasso) to construct different experimental scenarios, hoping to find the best predictive model. Among the general models, only XGBoost is found to generalize relatively well, without overfitting. Both the general models and the neural network models have their own accuracy rates. For the goal of predicting the rise or fall of the stock price, logistic regression, XGBoost, and the neural network are considered to give better predictive performance.

Parallel Keywords

Stock prediction; Logistic regression; Kernel support vector machine; Decision tree; Random forest; Extreme gradient boost; Artificial neural network
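
The abstract describes experimental scenarios that pair a classifier with a feature-scaling choice and then compare accuracy to check for overfitting. The following is a minimal sketch of one such scenario, not the thesis's actual code, data, or features. It assumes synthetic random-walk prices in place of real Yuanta Taiwan 50 quotes, uses the previous three daily returns as hypothetical features, and trains a plain gradient-descent logistic regression using only the Python standard library. All function names are illustrative.

```python
import math
import random


def make_random_walk(n, seed=0):
    """Synthetic closing prices following a random walk (assumption: not real ETF data)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))
    return prices


def build_dataset(prices, lags=3):
    """Features: the previous `lags` daily returns. Label: 1 if the next return is positive."""
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    X, y = [], []
    for t in range(lags, len(returns)):
        X.append(returns[t - lags:t])
        y.append(1 if returns[t] > 0 else 0)
    return X, y


def fit_scaler(X, mode):
    """Per-feature (offset, scale) for 'standardize' (z-score) or 'normalize' (min-max)."""
    params = []
    for col in zip(*X):
        if mode == "standardize":
            mean = sum(col) / len(col)
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
            params.append((mean, std))
        else:
            lo, hi = min(col), max(col)
            params.append((lo, (hi - lo) or 1.0))
    return params


def transform(X, params):
    """Apply previously fitted scaling parameters to every row."""
    return [[(v - off) / sc for v, (off, sc) in zip(row, params)] for row in X]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j, xi in enumerate(row):
                grad_w[j] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(X, y, w, b):
    """Fraction of rows whose predicted direction matches the label."""
    hits = sum(
        (sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) >= 0.5) == bool(label)
        for row, label in zip(X, y)
    )
    return hits / len(y)


def run_experiment(mode, n=600, split=0.8):
    """One scenario: scale, train on the earlier part, test on the later part."""
    X, y = build_dataset(make_random_walk(n))
    cut = int(len(X) * split)  # chronological split, no shuffling
    params = fit_scaler(X[:cut], mode)  # fit the scaler on training rows only
    X_train, X_test = transform(X[:cut], params), transform(X[cut:], params)
    w, b = train_logistic(X_train, y[:cut])
    return accuracy(X_train, y[:cut], w, b), accuracy(X_test, y[cut:], w, b)


if __name__ == "__main__":
    for mode in ("normalize", "standardize"):
        train_acc, test_acc = run_experiment(mode)
        print(f"{mode}: train={train_acc:.3f} test={test_acc:.3f}")
```

A large gap between training and test accuracy is the usual sign of overfitting that the abstract refers to. The scaler is fitted on the training rows only and the split is chronological, so no information from the test period leaks into training.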