Applying Hybrid Machine Learning to Forecast the Prices of FinTech ETFs

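This study uses a random forest model for importance analysis to exclude unimportant input variables, and then applies hybrid machine learning to forecast ETFs. It proposes a hybrid forecasting model to predict the closing price of financial technology (FinTech) ETFs. The proposed hybrid forecasting model comprises seven machine learning models. The Diebold-Mariano test is used to determine whether the proposed forecasting method outperforms the conventional method. The experimental results show that, when only input variables with importance scores above 0.1 are used and 10% of the data is held out for testing, the proposed hybrid KNN, Elastic Net, and LASSO models predict more accurately than the same models without input variable selection. The Diebold-Mariano test shows that the proposed hybrid KNN model performs best. In addition, the proposed hybrid stacking model integrates the seven models as the first layer and uses any one of the seven machine learning models as the second layer to improve forecasting accuracy. The experimental results show that the proposed hybrid stacking SVM model outperforms the other models. The findings can help investors make investment decisions, minimize investment risk, and obtain relatively stable returns from FinTech ETFs.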
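The selection step described in the abstract (keep inputs whose importance score exceeds 0.1, hold out 10% of the data for testing) can be sketched as follows. This is a minimal illustration, not the authors' code: the feature names and scores are hypothetical, the scores are assumed to have been computed already by a random forest, and holding out the most recent rows without shuffling is an assumption the abstract does not confirm.

```python
def select_features(importances, threshold=0.1):
    """Return the names of features whose importance score is strictly above threshold."""
    return [name for name, score in importances.items() if score > threshold]


def chronological_split(rows, test_fraction=0.1):
    """Hold out the last test_fraction of rows as the test set, without shuffling.

    Assumption: rows are ordered by date, oldest first.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


# Hypothetical importance scores, as a random forest might report them.
scores = {"open": 0.42, "high": 0.25, "low": 0.18, "volume": 0.09, "vix": 0.06}
kept = select_features(scores)                  # features passed on to the downstream model
train, test = chronological_split(list(range(20)))  # 20 dummy observations
```

The downstream model (KNN, Elastic Net, LASSO, and so on) would then be trained on the `kept` columns of `train` and evaluated on `test`.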

Keywords

Machine learning; FinTech ETF; hybrid forecasting model; random forest model; importance analysis; hybrid stacking model

Parallel Abstract

This research applies the random forests model to conduct importance analysis, exclude unimportant input variables, and carry out hybrid machine learning to forecast Exchange Traded Funds (ETFs). A hybrid forecasting model is proposed to predict the closing price of Financial Technology (FinTech) ETFs. The proposed hybrid forecasting models cover seven machine learning models. The Diebold-Mariano test (DM test) is applied to determine whether the proposed method forecasts better than the original methods. The experimental results show that, with feature importance scores above 0.1 and 10% of the data used for testing, the proposed hybrid K-Nearest Neighbor (KNN), Elastic Net, and Least Absolute Shrinkage and Selection Operator (LASSO) models achieve higher prediction accuracy than the same models without input feature selection. The DM test shows that the proposed hybrid KNN model performs best. Further, a hybrid stacking model integrates the seven models as the first-layer structure and one of the seven machine learning models as the second-layer structure to increase prediction accuracy. Overall, the proposed hybrid stacking Support Vector Machine (SVM) model outperforms the other models. These results may help investors make investment decisions, minimize investment risk, and reach relatively stable returns from FinTech ETFs.
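To make the model comparison concrete, the following is a minimal sketch of a Diebold-Mariano test for one-step-ahead forecasts. The squared-error loss, the horizon of one, and the normal approximation for the p-value are assumptions chosen for illustration; the abstract does not state which loss function or horizon the study used, and the data below are invented.

```python
import math
from statistics import NormalDist, mean


def diebold_mariano(actual, pred_a, pred_b):
    """One-step-ahead DM test under squared-error loss.

    Returns (statistic, two-sided p-value). A negative statistic means
    pred_a has the lower mean squared error.
    """
    if not (len(actual) == len(pred_a) == len(pred_b)):
        raise ValueError("all series must have the same length")
    n = len(actual)
    if n < 2:
        raise ValueError("need at least two observations")
    # Loss differential: d_t = e_{a,t}^2 - e_{b,t}^2
    d = [(y - a) ** 2 - (y - b) ** 2 for y, a, b in zip(actual, pred_a, pred_b)]
    d_bar = mean(d)
    # At horizon 1 the long-run variance reduces to the sample variance of d
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0:
        raise ValueError("loss differentials are constant; statistic undefined")
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Invented closing prices and two competing forecasts.
actual = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5]
pred_a = [10.1, 10.4, 11.1, 10.7, 11.3, 11.4]  # e.g. a model with feature selection
pred_b = [10.4, 10.0, 11.6, 10.2, 11.9, 11.0]  # e.g. the same model without it
stat, p_value = diebold_mariano(actual, pred_a, pred_b)
```

With real data, a small p-value together with a negative statistic would indicate that the first forecast is significantly more accurate than the second under this loss.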

Parallel Keywords

Machine Learning; Financial Technology ETF; Hybrid Forecasting Model; Random Forests Model; Importance Analysis; Hybrid Stacking Model

