The Design and Implementation of a Grid-Computing Environment for Mining Sequential Patterns

Abstract

This paper presents the design and implementation of a grid-computing environment for mining sequential patterns. An Apriori-like sequential pattern mining algorithm is implemented in this environment, and its mining performance and results are verified and analyzed. Compared with other sequential pattern mining algorithms, an Apriori-like algorithm performs a large amount of repetitive, recursive data processing and is therefore not especially efficient. With only minor changes to the mining procedure, however, it suits loosely coupled distributed processing, so its tasks can be distributed across a grid-computing environment. The proposed environment defines two types of grid node: computing-grid nodes and data-grid nodes. Every node is implemented with Globus Toolkit and has the distributed mining program developed in this study installed and configured. Grid services are invoked by users or by remote grid nodes and return their mining results to the grid master, so that the nodes complete the mining task cooperatively. The environment consists of 16 grid nodes distributed across two WANs, mainly the campus networks of two different universities. Each node is a standalone computer with a different hardware configuration, which reflects a realistic grid deployment. The experimental results and performance evaluation show that the grid-computing environment provides a highly flexible and efficient platform suitable for mining sequential patterns from large databases.
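
To make the loosely coupled structure described in the abstract concrete, the following is a minimal Python sketch of level-wise (Apriori-like) sequential pattern mining in which support counting is split across data partitions. It is an illustration under stated assumptions, not the authors' implementation: each sequence is simplified to an ordered list of single items, the partitions merely stand in for the data held on separate grid nodes, and the names `is_subsequence`, `count_local`, and `mine` are hypothetical. No Globus Toolkit calls are shown.

```python
from collections import Counter
from itertools import product


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order and repeats are respected.
    return all(item in it for item in pattern)


def count_local(candidates, partition):
    """Work for one (hypothetical) node: count candidates in its local sequences."""
    counts = Counter()
    for seq in partition:
        for cand in candidates:
            if is_subsequence(cand, seq):
                counts[cand] += 1
    return counts


def mine(partitions, min_support):
    """Level-wise mining; each partition is counted independently of the others."""
    items = sorted({item for part in partitions for seq in part for item in seq})
    candidates = [(item,) for item in items]
    frequent = {}
    while candidates:
        # Assumption: this loop runs sequentially here; on a grid, each
        # partition would be counted on a different node and the partial
        # counts sent back to a coordinating node to be summed.
        total = Counter()
        for part in partitions:
            total.update(count_local(candidates, part))
        level = {c: n for c, n in total.items() if n >= min_support}
        frequent.update(level)
        # Join step: extend p with the last item of q when p[1:] == q[:-1].
        candidates = sorted(
            {p + (q[-1],) for p, q in product(level, repeat=2) if p[1:] == q[:-1]}
        )
    return frequent
```

Calling `mine` with two partitions, for example `[[["a", "b", "c"], ["a", "c"]], [["a", "b", "c", "d"], ["b", "c"]]]`, and `min_support=2` returns the frequent patterns with their global support counts. Only the candidate list and the per-partition counts cross partition boundaries, which is what makes this style of algorithm straightforward to distribute even though each level rescans the data.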
