Applying Association Rules to Mining Relative-Relation Rules in Securities Trading

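When association rules are applied to numerical data, the usual approach is to discretize the data first, assigning values to intervals of equal length. Discretization of this kind is therefore a comparison of "absolute values". Some numerical data, however, are inherently unsuited to such discretization. In securities markets, for example, many well-known technical indicators are used by comparing "relative values" rather than "absolute values". If we simply apply equal-width or equal-frequency discretization to numerical data without considering that the data inherently carry "relative comparison" relationships, information is lost.

This study therefore proposes a "relative value" comparison relation, so that the processing of numerical data is not limited to the "absolute comparison" of equal-width or equal-frequency partitioning. Relative comparison lets the data be used in a way that is closer to their original meaning, and makes association rules better suited to numerical data.

After applying the concept of "relative value comparison" to association rules and mining the data, this study uses "classification-based association rules" to classify and predict the target field. The classification consists of two steps, "rule simplification" and "collective prediction". Rule simplification uses the concept of subsets to select the more general rules from the full rule set, which simplifies and consolidates the rules and mitigates the tendency of association rule mining to generate too many rules. Collective prediction then predicts the target data and uses a total-confidence threshold to raise prediction accuracy. The methods are validated experimentally on U.S. stock market trading data from January 1, 2003 to December 31, 2006. Compared with using only "absolute comparison", adding the "relative value" comparison relation significantly improves prediction accuracy in both the training period and the testing period. Further adding rule simplification and collective prediction also effectively raises prediction accuracy, so the results can serve as a reference for investment decisions.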
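The contrast between "absolute" and "relative" comparison can be illustrated with a minimal sketch. It is not the thesis's actual preprocessing: the moving-average comparison, the item names `close>MA` and `close<=MA`, the window length, and the price series are all assumptions chosen for illustration.

```python
from statistics import mean

def equal_width_bins(values, n_bins):
    """Absolute comparison: map each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def relative_items(closes, window):
    """Relative comparison: compare each close with its own trailing moving average.

    The item labels are hypothetical names used only in this sketch.
    """
    items = []
    for i in range(window - 1, len(closes)):
        ma = mean(closes[i - window + 1 : i + 1])
        items.append("close>MA" if closes[i] > ma else "close<=MA")
    return items

# Made-up prices: the same rise-then-fall shape at two very different price levels.
closes = [10.0, 11.0, 12.0, 11.0, 10.0, 50.0, 51.0, 52.0, 51.0, 50.0]

bins = equal_width_bins(closes, 2)
rel = relative_items(closes, 3)

print(bins)
print(rel)
```

In this toy series, equal-width binning separates the two price levels but assigns every day within a level to the same bin, so the rise-then-fall shape disappears. The relative items produce the same pattern for both levels, which is the kind of information the abstract argues is lost under purely absolute discretization.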

Keywords

Data mining; association rules; relative relation; technical analysis

Parallel Abstract

To analyze numerical data with association rules, the data must first be discretized before they can serve as input for data mining. The common discretization methods are "equal width interval" and "equal frequency interval". We categorize both methods as "absolute", because both classify values into intervals of the same size. In practice, equal width and equal frequency intervals are not necessarily suitable for all kinds of data. For example, many popular technical analysis indicators are used through "relative comparison" rather than "absolute comparison". If we treat all data as "absolute comparison" data without considering whether they are "relative comparison" data by nature, we lose information, because important features of the data are ignored.

For this reason, we propose the concept of a "relative-type comparative relation" as an alternative to "equal width interval" and "equal frequency interval" in data preprocessing. Through "relative comparison", numerical data can be transformed into data mining input that stays closer in meaning to the original data, which reduces information loss and improves the mining results.

After applying "relative comparison" to association rule mining, we use CBA (Classification Based on Associations) to classify and predict the target data. CBA is divided into two steps, "rule simplification" and "collective evaluation". Rule simplification eliminates redundant rules and consolidates the general rules used for classification. Collective evaluation uses the total confidence of the screened rules to classify and predict the target data, improving the accuracy of classification and prediction.

The experimental data are extracted from American stock trading data from 2003 to 2006. The experimental results show that applying "relative comparison" improves the precision of stock price prediction. Adding "rule simplification" and "collective evaluation" raises the precision further.
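The two classification steps can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the exact redundancy criterion or how total confidence is computed, so here a rule is treated as redundant when a more general rule (proper-subset antecedent, same class) has confidence at least as high, and total confidence is the sum of confidences of matching rules. The item names, confidences, and thresholds are hypothetical.

```python
from collections import defaultdict

def simplify(rules):
    """Rule simplification (assumed criterion): drop a rule when another rule for
    the same class has a proper-subset antecedent and confidence at least as high."""
    kept = []
    for ante, label, conf in rules:
        dominated = any(
            o_label == label and o_ante < ante and o_conf >= conf
            for o_ante, o_label, o_conf in rules
        )
        if not dominated:
            kept.append((ante, label, conf))
    return kept

def predict(rules, record, threshold):
    """Collective evaluation (assumed form): sum the confidence of every rule whose
    antecedent is contained in the record; predict the best class only if its
    total confidence reaches the threshold, otherwise abstain (None)."""
    totals = defaultdict(float)
    for ante, label, conf in rules:
        if ante <= record:
            totals[label] += conf
    if not totals:
        return None
    best = max(totals, key=totals.get)
    return best if totals[best] >= threshold else None

# Hypothetical rules: (antecedent items, predicted class, confidence).
rules = [
    (frozenset({"close>MA"}), "up", 0.70),
    (frozenset({"close>MA", "vol>avg"}), "up", 0.65),  # more specific, not more confident
    (frozenset({"K>D"}), "up", 0.60),
    (frozenset({"close<=MA"}), "down", 0.75),
]

kept = simplify(rules)
strong = predict(kept, {"close>MA", "K>D"}, threshold=1.2)  # two rules agree
weak = predict(kept, {"K>D"}, threshold=1.2)                # total too low, abstain
down = predict(kept, {"close<=MA"}, threshold=0.7)

print(len(kept), strong, weak, down)
```

Abstaining when the total confidence is below the threshold is one way such a threshold could raise accuracy: predictions are issued only when enough rule evidence supports them.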

Parallel Keywords

Technical Analysis; Relative Relation; Association Rules; Data Mining

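Cited by

周志強 (2014). A Study of a Trading Strategy Program Management Platform Focused on Performance Comparison [Master's thesis, National Central University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0031-0412201512002833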
