Computation and Communication Schedule Optimization for Jobs with Shared Data

Advisor: Pangfeng Liu (劉邦鋒)

Abstract


Almost every computation job in a cluster or grid system requires input data to produce its result, and the computation cannot start until the required data become available. As a result, properly interleaving data transfers and job execution has a significant impact on overall efficiency. In this thesis we analyze the computational complexity of the shared-data job scheduling problem, both with and without a storage capacity constraint. We show that when server capacity is bounded, the problem is NP-complete even if each job depends on at most three data items. On the other hand, when server capacity is unbounded, we show that an efficient algorithm finds an optimal job schedule if each job depends on at most two data items. We also give an efficient heuristic algorithm that produces good schedules when there is no limit on the number of data items a job may access. Experimental results indicate that this heuristic performs very well and finds schedules very close to optimal.

Keywords

shared data, job scheduling

