透過您的圖書館登入
IP:3.137.164.210
  • 學位論文

為非耦合軟體管線所設計的鎖無關且尊重快取機制之軟體佇列

Lock-free cache-friendly software queue for decoupled software pipelining

指導教授 : 楊武

摘要


近幾年來,不論是在伺服器或個人電腦領域,多核心平台皆已成為主流。平行化是能充份利用多核心平台所提供額外計算能力的其中一種方法。然而,一般應用程式具有複雜的資料與控制相依性,此種特性使得傳統平行化技術,如: DOALL 和 DOACROSS,無法應用於其上。Decoupled Software Pipelining (DSWP) 是一新的平行化技術,其擁有平行化一般應用程式的潛能。然而,DSWP 的成功有賴於處理器之間高速的傳輸與同步。目前 DSWP 在商用多核心平台上的效能表現並不理想。其主因在於處理器之間傳輸與同步基本上有賴於鎖相關、不尊重快取機制的軟體方式達成。此種作法將會大幅抵消 DSWP 所可能帶來的好處。 我們為 DSWP 提出一個鎖無關、尊重快取機制的軟體佇列。一個鎖無關且尊重快取機制的實作需要考慮到記憶體子系統的兩個不同面向,記憶體一致性 (memory coherence) 和記憶體一貫性 (memory consistency)。我們舉例說明忽略或混淆前述兩個不同面向將如何導致不正確或無效率的實作。之後,我們提出一個正確且有效率的實作, 並同時給出了詳細的解釋。由於平行程式本質上的不確定性,傳統的測試技術無法用來證明其正確性。我們同時以非正式和正式的方法討論我們實作上的正確性。

並列摘要


Multicore has become a trend on server and client computers in recent years. Parallelization is one way to fully utilize the computing power provided by multicore architectures. Most applications of interest have complex data and control dependency, which make traditional parallelization techniques, such as DOALL and DOACROSS, inapplicable. Decoupled Software Pipelining (DSWP), a new parallelization technique, shows its potential on parallelizing general applications. However, its success relies on fast inter-core synchronization and communication. On commodity multicore platforms, the performance of current DSWP disappoints us since the overhead involving lock-based, cache dishonored software approach offsets the benefit from DSWP. We present a lock-free, cache-friendly software queue designed for DSWP. A lock-free, cache-friendly solution need take two different aspects of memory system, memory coherence and memory consistency, into consideration. We show how inattention to these two aspects leads to incorrect or inefficient solutions. We also present our approach to providing a correct and efficient solution with detailed explanation. Due to the nondeterministic nature of parallel programs, traditional testing techniques cannot be used to fully verify the correctness of the implementation. We also discuss the correctness of our implementation both in informal and formal ways.

參考文獻


[2] R. Allen and K. Kennedy, Optimizing compilers for modern architectures: a dependence-based approach. Morgan Kaufmann Publishers, 2002.
[3] R. Rangan, N. Vachharajani, M. Vachharajani, and D. August, “Decoupled software pipelining with the synchronization array,” in Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques. IEEE Computer Society
[4] G. Ottoni, R. Rangan, A. Stoler, and D. August, “Automatic thread extraction with decoupled software pipelining,” in Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2005, p. 118.
International Symposium on Microarchitecture. IEEE Computer Society, 2006, pp. 259–272.
[7] L. Lamport, “How to make a multiprocessor computer that correctly executes multiprocess progranm,” IEEE transactions on computers, vol. 100, no. 28, pp. 690–691, 1979.

被引用紀錄


韓國峰(2011)。公部門導入顧客關係管理系統之個案研究〔碩士論文,國立臺灣師範大學〕。華藝線上圖書館。https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315254312

延伸閱讀