Mining Implicit Relations among Data and Its Prospects

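Data mining is the extraction of valuable knowledge from large volumes of data, that is, the act of turning data into knowledge. The data may be of any type, such as ordinary transaction data and multimedia data, while knowledge is the concrete expression and presentation of the implicit relations among data. Because data mining helps enterprises obtain knowledge from data and create competitive advantage, it has attracted wide attention and spurred the development of many new research methods and systems, becoming a fast-growing field. For existing data mining methods and systems, this paper proposes nine categories according to the kind of "implicit relation among data": association, sequence, structure, periodicity, similarity, interestingness, personalization, suitability, and generalization. For each relation, we present its definition, applications, current research, and research prospects. Besides helping readers understand the current state of the field, this paper offers a useful classification of data mining methods and introduces a comparative study of them.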

Keywords

Data mining; knowledge; implicit relations among data

Parallel Abstract

Data mining is the extraction of useful knowledge from a huge amount of data. The data can be of a variety of types, such as transaction data, relational data, and multimedia data, whereas knowledge is an explicit expression and representation of implicit data relations. Because data mining can help businesses obtain knowledge and create competitive advantage, it is not surprising that a great deal of research has been done in this field. Given the field's rapid growth and abundant results, it is difficult to cover every issue in a single paper. This paper therefore provides only a reasonably comprehensive report on recent developments in data mining technology. For existing data mining methods and systems, this paper proposes nine distinct categories according to the implicit data relation they address: association, sequence, structure, periodicity, similarity, interestingness, personalization, suitability, and generalization. For each relation, we discuss its definition, applications, algorithms, and future research directions. The contributions of this paper are (1) a classification based on the implicit data relation, (2) a comparative study of these categories, and (3) a description of the state of the art for each category.

Parallel Keywords

Data mining; knowledge; implicit data relation


Cited by

蔡運生 (2011). 利用資料探勘技術分析WIFLY用戶通路移轉分析 [master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2011.01203

許俊傑 (2007). MIHSPM:一個多項目集的混合循序樣式探勘演算法 [master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2007.00920

邱瑋亭 (2006). 消費者迷的對象、消費行為、產品與廣告代言人選擇關聯性探勘之研究 [master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2006.00398

曾冠倫 (2017). 以工業4.0為基礎之智慧工廠大數據平台建置 [master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu201700450

Shieh, Y. C. (2006). 應用序列樣式探勘技術於行為變化之研究 [master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2006.00232
