
針對多核心軟體效能提升之資料溝通瓶頸分析

Communication Bottleneck Analysis for Improving the Performance of Multicore Software

Advisor: 熊博安

Abstract


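Software performance analysis is an important and difficult problem. As software and hardware have evolved, researchers have proposed many methods for analyzing and modeling software performance. Performance modeling helps developers evaluate and predict software performance during development. As hardware performance improves, software architectures grow more complex, and developers of software for multicore systems especially need to avoid spending large amounts of time and manpower on optimization just to tune the software until it meets a given performance requirement. Multicore processors have gradually become the mainstream processing unit, and many applications are being developed or modified into parallel versions to obtain the best performance. Developing software for a multicore environment involves considerably more factors than developing for a traditional single-core environment, such as contention for system resources, synchronization mechanisms, and contention for shared cache. In the traditional development process, performance is usually measured only after the whole software has been implemented. If the performance meets the given requirement, development can conclude. If it does not, multicore software is very hard to modify: the location of the bottleneck cannot easily be determined, and the overall software architecture itself may be the cause of the poor performance. Because most of the code has already been implemented, changes are very difficult, and much of the code may have to be reimplemented. An accurate performance model that helps developers predict software performance, so that the software can achieve its best performance on multicore systems, is therefore very important. This thesis first analyzes three main factors that affect multicore software: parallelism, communication pattern, and data locality, and examines how multicore performance bottlenecks relate to these three factors. We propose the Communication-Oriented Performance Estimation (COPE) method, which helps developers analyze and detect performance bottlenecks in multicore software and advises them on how to set an appropriate number of threads, under the current software architecture and configuration, to obtain the best performance.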

Parallel Abstract


Performance modeling can help developers estimate system performance at an early stage, so that the large cost of tuning software to reach the target performance can be avoided. Since multicore processors have already become mainstream for computing, many applications are being parallelized to improve performance. Developing an application for multicore platforms is more complex than for single-core platforms, and more factors need to be considered. For example, resource contention, synchronization problems, and shared cache conflicts are some of the critical factors. Traditionally, performance was measured after most of the code had been implemented. If the performance does not meet the application requirements, the parallel version of the application is very difficult to modify because a large amount of code needs to be changed. Thus, an accurate model is needed to estimate system performance and to guide developers in tuning for optimal performance. In this work, we first analyze the impact on performance of three factors: thread parallelism, communication pattern, and data locality. Moreover, we analyze performance bottlenecks through the correlation among the three factors. We propose a communication-oriented performance estimation method to help programmers detect and analyze performance bottlenecks. Furthermore, we suggest how to adjust the number of threads to obtain better performance under the current configuration.

