A Novel Prosperity Surveillance System for Taiwan Based on Search Logs and Learning to Rank

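Prosperity surveillance, that is, monitoring the state of the economy, is an important issue for governments and businesses, and most studies monitor it with economic indicators. Because these indicators must be compiled jointly by different government departments, they often require lengthy processing, which delays the release of the prosperity status; that delay in turn increases the uncertainty of government and business decisions. To overcome this problem, this study builds a novel prosperity prediction model based on search logs. The model uses a learning-to-rank algorithm to select, from a small number of top-ranked documents returned by a search engine, the terms that best represent the prosperity status. It then obtains the search logs for these terms on the web and combines them with an advanced machine learning model to predict changes in the prosperity status. Because the documents and search frequencies provided by search engines are highly timely and readily available, economic indicators derived from search terms can effectively reduce the delay in releasing the prosperity status and thereby reduce decision uncertainty. Experimental results show that the proposed framework predicts the prosperity status accurately, and that the model built with the learning-to-rank algorithm is also more accurate than models built with traditional feature selection methods.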

Keywords

prosperity surveillance; search logs; learning to rank; feature selection

Parallel Abstract

Purpose: Prosperity surveillance is an important issue for countries and organizations. Generally, the surveillance indicators comprise multiple economic variables that are compiled by different government departments. Compiling these variables involves a great deal of data processing, which delays the surveillance of prosperity. Design/methodology/approach: In this paper, we propose a novel prosperity surveillance system that utilizes search logs from a search engine. The system employs a learning-to-rank algorithm to identify discriminative terms that are representative of prosperity. The representative terms and their query frequencies are then fed to a state-of-the-art data mining model to enhance the effectiveness of prosperity surveillance. Findings: The experimental results show that our prosperity surveillance system performs well and that our feature selection method based on learning to rank outperforms other popular feature selection methods. Research limitations/implications: This study focused only on search log information. In future work, we plan to investigate more information sources (e.g., news postings and internet forums) to enhance the proposed feature selection method. Practical implications: We have proposed an effective framework for predicting the status of prosperity in Taiwan. The proposed method can support government officials and authorities, helping them respond to fast-changing events and topics and make appropriate decisions. Originality/value: To the best of our knowledge, this study is the first attempt to apply search logs and learning to rank to predict the status of prosperity in Taiwan.

Parallel Keywords

prosperity surveillance; search log; learning to rank; feature selection

