透過您的圖書館登入
IP:3.14.247.5
  • 學位論文

尋找多核心系統架構程式最佳化機會之新方法

A New Prospect in Finding Optimization Opportunities for Multicore Architectures

指導教授 : 劉邦鋒
共同指導教授 : 游本中
若您是本文的作者,可授權文章由華藝線上圖書館中協助推廣。

摘要


程式最佳化主要可以分爲兩種方式-靜態最佳化以及動態最佳化,這兩種最佳化方式在處理單核心系統架構中的程式都有不錯的表現。但是在多核心系統新架構下,兩種最佳化方式在找尋程式中的最佳化機會時,都沒有將多執行緒程式執行緒間的互動列入考慮。我們的目的是要發展出一個能夠辨認出執行緒間互動的技術,並且利用這些資訊來幫助程式最佳化。經由觀察發現,執行緒間的互動像是競爭公用快取記憶體,可能導致“不穩定”的程式行爲。也就是說,執行程式中相同的程式碼片段,理論上會有相同或相似的效能,但實際上卻有很大的差異, 就會被稱作“不穩定”。我們將這些不穩定的程式片段視爲“最佳化的機會”, 希望能夠藉由最佳化這些片段,使它們恢穩定,進而提升執行效能。 我們提出了一個簡單而且有效率的方法,藉由取樣以及分析基本區塊的效能變化,來分辨出哪些基本區塊是“穩定”,而哪些是“不穩定”。分析得到的結果能夠讓動態最佳化器用來分辨在程式執行的過程中,哪些基本區塊是不穩定的,需要特別注意或特殊處理。我們可以藉由將取樣到的指令指標對應回它所屬的基本區塊,進而找到出現次數最多的基本區塊,用來當作程式執行區間的代表。藉此可以比較不同執行緒中有相同代表的程式區段的效能,計算出每個區段的效能變化。這個方法也能夠應用在單一執行緒的程式。

並列摘要


There have two groups of works in optimizing program execution in the literature – static and dynamic program optimization. To our best knowledge, neither of these optimizations, while looking for optimization opportunities, considers interactions among threads in multi-core architecture. Therefore we would like to develop techniques that can identify the presence of thread interactions and use it to guide possible optimization. We observe that interaction among threads, like competition for shared cache, can lead to “unstable” execution performance. That is, the same part of program will have very different performance characteristics, therefore we identify those parts of program as dynamic optimization opportunity, so that they can be optimized for better performance. We propose a simple and efficient sampling method that analyzes performance variance among basic blocks, so as to differentiate “unstable” and “stable” basic blocks. The results from the analysis can be used as a reference to determine which parts of the program on which dynamic optimizer should make extra efforts during execution. By mapping EIP of each sample back to its basic block, we are able to choose representative basic block for each interval during execution, and compare the performance of each thread, so as to calculate the performance variance of each basic block. This sampling technique can also be applied to single-threaded programs.

參考文獻


[1] V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: a transparent dynamic optimization system. SIGPLAN Not., 35(5):1–12, 2000.
[2] D. Bruening, T. Garnett, and S. Amarasinghe. An infrastructure for adaptive dynamic optimization. In CGO ’03: Proceedings of the international symposium on Code generation and optimization, pages 265–275, Washington, DC, USA, 2003. IEEE Computer Society.
[7] T. Kistler and M. Franz. Continuous program optimization: A case study. ACM Trans. Program. Lang. Syst., 25(4):500–548, 2003.
[8] G. Hamerly, E. Perelman, and B. Calderd. How to use simpoint to pick simulation points. SIGMETRICS Perform. Eval. Rev., 31(4):25–30, 2004.
[9] A. Das, J. Lu, and W.C. Hsu. Region monitoring for local phase detection in dynamic optimization systems. In CGO ’06: Proceedings of the International Symposium on Code Generation and Optimization, pages 124–134, Washington, DC, USA, 2006. IEEE Computer Society.

延伸閱讀