
基於程式相態特性分析與機器學習的矢量友好性預測

Estimation of Vector Friendliness Based on Program Phase Profiles and Machine Learning

Advisor: 洪士灝

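Abstract


Most current processors provide vector instructions, which use multiple computing units simultaneously to deliver higher performance than scalar instructions. Although existing compilers support these instructions, restrictions such as non-unit strides, conditional branches, and pointers can cause a compiler to emit scalar code for programs that are suitable for vectorization. We propose a new metric, Vector Friendliness, which represents the probability that a program phase is suitable for vectorization. We also define a new term, Vector Intensive Phase (VIP): a phase is considered a VIP when more than 50% of its lines are vector lines. To find VIPs, we use a machine learning model, a Recurrent Neural Network (RNN), to learn which memory traces are suitable for vectorization; the model's input is a memory trace and its output is the vector friendliness. We collected many programs and used a program-phase-based simulator to extract memory traces, then used an automatic labeling system to label every collected program phase as VIP or non-VIP. In addition, we propose a synthetic data generator to synthesize more memory traces. After training, our model reaches an accuracy of 90%, which indicates that using memory traces to identify VIPs is feasible. Having predicted vector friendliness, we can go further and predict whether a program phase is suitable for execution on Xeon Phi. We use the ratios of various instruction types as the input to a Support Vector Machine (SVM) to predict Xeon Phi friendliness, which indicates whether a program phase is suitable for execution on Xeon Phi; the final model's accuracy is 85%.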
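The VIP rule stated above (a phase is a VIP when more than 50% of its lines are vector lines) can be illustrated with a minimal Python sketch. The abstract does not say how a "vector line" is counted or how the automatic labeling system is implemented, so the function name `label_phase`, its parameters, and the sample counts below are hypothetical and only show the threshold comparison.

```python
def label_phase(vector_lines: int, total_lines: int, threshold: float = 0.5) -> str:
    """Label a phase as VIP when its share of vector lines exceeds the threshold.

    Assumption: the caller already knows how many lines of the phase are
    vector lines; how that count is obtained is not described in the abstract.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    share = vector_lines / total_lines
    # Strictly greater than the threshold, matching "more than 50%".
    return "VIP" if share > threshold else "non-VIP"


# Hypothetical (vector_lines, total_lines) counts for three phases.
phases = {"phase_a": (12, 20), "phase_b": (5, 20), "phase_c": (10, 20)}
labels = {name: label_phase(v, t) for name, (v, t) in phases.items()}
```

Note that a phase with exactly half of its lines vectorized (`phase_c`) is labeled non-VIP under this reading, because the rule requires the share to exceed 50%.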

Parallel Abstract


Many of today's processors provide vector instructions that can utilize multiple computing units in parallel to deliver higher performance than scalar instructions. While vectorizing compiler techniques exist to take advantage of vector instructions, the compiler often fails to vectorize code sequences that could be manually converted into vector code, due to restrictions such as non-unit strides, conditional branches, and pointers. We propose Vector Friendliness to quantify the probability that a program phase is suitable for vectorization. We also define the Vector Intensive Phase (VIP), which represents a code sequence that is suitable for vectorization. To find VIPs, we leverage a Recurrent Neural Network (RNN) to recognize, from memory traces, which program phases can be vectorized. This helps programmers identify program phases that could have been vectorized manually but were not vectorized by the compiler. We collect programs from benchmarks and apply a program-phase-based profiler to extract the memory traces. We then use the proposed labeling system to classify these program phases automatically. After training, the accuracy of our model reaches 90%. We found that using memory traces to classify VIPs is feasible. Beyond vector friendliness, we use a Support Vector Machine (SVM) to analyze the ratio of various types of instructions and then report Xeon Phi friendliness.
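As a rough illustration of the "ratio of various types of instructions" used as SVM input, the sketch below turns a list of instruction categories into a fixed-length feature vector. The abstract does not list the instruction types or describe the feature pipeline, so the category names, the function `instruction_ratios`, and the sample phase are assumptions made for this example only.

```python
from collections import Counter


def instruction_ratios(opcodes, categories=("scalar", "vector", "load", "store", "branch")):
    """Return the fraction of instructions falling in each category.

    Assumption: each executed instruction has already been mapped to one of
    the hypothetical category names; instructions outside `categories` are ignored.
    """
    counts = Counter(opcodes)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


# Hypothetical instruction categories observed in one program phase.
phase = ["load", "vector", "vector", "store", "branch", "scalar", "vector", "load"]
features = instruction_ratios(phase)
```

A vector like `features` could then be passed, one row per program phase, to any SVM implementation; the choice of library and kernel is left open here because the abstract does not specify it.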

