This research applies a random forest model to rank input variables by importance, excludes the unimportant ones, and then uses hybrid machine learning to forecast exchange-traded funds (ETFs). A hybrid forecasting model is proposed to predict the closing price of financial technology (FinTech) ETFs. The proposed hybrid approach comprises seven machine learning models. The Diebold-Mariano (DM) test is used to determine whether the forecasts of the proposed method outperform those of the original methods. The experimental results show that, when only input variables with importance scores above 0.1 are retained and 10% of the data is held out for testing, the proposed hybrid K-Nearest Neighbor (KNN), Elastic Net, and Least Absolute Shrinkage and Selection Operator (LASSO) models predict more accurately than the same models without input variable selection. The DM test shows that the proposed hybrid KNN model performs best. In addition, a hybrid stacking model uses the seven models as its first layer and one of the seven machine learning models as its second layer to improve prediction accuracy. Overall, the proposed hybrid stacking Support Vector Machine (SVM) model outperforms the other models. These results may help investors make investment decisions, minimize investment risk, and obtain relatively stable returns from FinTech ETFs.
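To make the model comparison concrete, the following is a minimal sketch of how a Diebold-Mariano statistic could be computed for two competing sets of forecasts. The abstract does not state the loss function or the forecast horizon, so the sketch assumes squared-error loss, one-step-ahead forecasts, and the asymptotic standard normal distribution; the function name `diebold_mariano` and the sample prices are hypothetical and are not taken from the study.

```python
import math
from statistics import mean


def diebold_mariano(actual, pred_a, pred_b):
    """Return (DM statistic, two-sided p-value) comparing two forecasts.

    Assumptions (not stated in the abstract): squared-error loss and
    one-step-ahead forecasts, so only the lag-0 variance of the loss
    differential is used. A negative statistic means forecast A has
    the lower average loss.
    """
    if not (len(actual) == len(pred_a) == len(pred_b)):
        raise ValueError("all series must have the same length")
    n = len(actual)
    if n < 2:
        raise ValueError("need at least two observations")

    # Loss differential d_t = e_a(t)^2 - e_b(t)^2
    d = [(y - a) ** 2 - (y - b) ** 2 for y, a, b in zip(actual, pred_a, pred_b)]
    d_bar = mean(d)

    # Lag-0 autocovariance of d (1/n normalisation)
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n
    if gamma0 == 0:
        raise ValueError("loss differentials are constant; statistic undefined")

    dm = d_bar / math.sqrt(gamma0 / n)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|dm|))
    p_value = math.erfc(abs(dm) / math.sqrt(2))
    return dm, p_value


# Hypothetical closing prices and forecasts, for illustration only
actual = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5]
pred_a = [10.1, 10.4, 11.1, 10.7, 11.3, 11.4]  # e.g. a model with feature selection
pred_b = [10.5, 10.0, 11.8, 10.2, 11.9, 11.0]  # e.g. the same model without it

dm_stat, p = diebold_mariano(actual, pred_a, pred_b)
print(f"DM statistic: {dm_stat:.3f}, p-value: {p:.4f}")
```

With this sign convention, a negative statistic with a small p-value would favor the first forecast. For multi-step forecasts the variance term would also need autocovariances at higher lags, which this sketch omits.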