
Enlarging Quantum Circuit Simulation and Analysis with Non-Volatile Memories (以非揮發性記憶體擴大量子電路的模擬與分析)

Advisor: 洪士灝 (Shih-Hao Hung)

Abstract


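With the rapid development of quantum computing, high-performance simulation of quantum computation has become an active research topic for developing quantum computing systems and applications. However, because the runtime and memory capacity required to simulate a large-scale quantum circuit grow exponentially with the number of qubits, simulating such circuits on classical computers is a challenging task. Large-scale simulations today commonly use a cluster to enlarge the available memory capacity. This approach imposes a high monetary cost on the user and also incurs substantial communication overhead from the large volume of data exchanged. This thesis proposes a cost-effective, optimized method that uses non-volatile memory (NVM) to perform large-scale quantum circuit simulation on a single machine. Whereas today's high-end servers can host only terabytes of DRAM, terabyte-class NVMe drives are common, and an ordinary computer can easily attach NVMe disk arrays with hundreds of terabytes of capacity. In addition, NVM disks cost roughly one hundredth as much as DRAM. NVM therefore makes it much easier to provide a large, cost-effective memory space on a single computer. The method not only exploits the large capacity of NVM but also, based on how data is laid out in NVM, optimizes for contiguous data access and extensive data reuse. For specific commonly used quantum gates, including CNOT, CZ, CS, CP(theta), SWAP, and Toffoli, we skip unnecessary data accesses to reduce the simulator's runtime. Furthermore, because accessing data in NVM is much slower than accessing DRAM, we propose a quantum circuit scheduler that aggregates quantum gates into larger N-qubit gates, which reduces both the circuit depth and the number of data fetches from NVM and thus speeds up the simulation.

The proposed method is not intended for small quantum circuits that fit in a computer's DRAM. Results from small circuits nevertheless serve as a reference for estimating the performance gap between large-circuit simulators built on NVM and those built on DRAM. To evaluate and validate the method, we therefore ran a series of quantum circuits and compared their runtimes against QuEST, a quantum simulator developed at the University of Oxford. The experimental results show that the method lets users simulate large quantum circuits that exceed DRAM capacity at lower cost and within a reasonable runtime. Without the proposed scheduler, the simulator is limited by PCIe transfer speed: in the worst case observed, its runtime is 2 times that of QuEST for small circuits and about 10.9 times that of QuEST for large circuits. With the proposed scheduler, our NVM-based large-circuit simulator can run 1.2 times faster than QuEST, which directly demonstrates the speed benefit of the scheduler.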
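The saving from skipping unnecessary data accesses can be illustrated with a minimal sketch. It assumes a plain state-vector representation held in a Python list, which stands in here for the NVM-resident data; the helper names (`insert_bit`, `apply_cz`, `apply_cnot`) are hypothetical and are not taken from the thesis. Under these assumptions, CZ only needs the quarter of the amplitudes whose control and target bits are both 1, and CNOT only needs the half whose control bit is 1.

```python
def insert_bit(x, pos, bit):
    """Return x with `bit` inserted at bit position `pos` (higher bits shift left)."""
    low = x & ((1 << pos) - 1)
    return ((x >> pos) << (pos + 1)) | (bit << pos) | low


def apply_cz(state, control, target):
    """Negate only the amplitudes whose control and target bits are both 1.

    Returns the number of amplitudes touched (a quarter of the state vector).
    """
    lo, hi = sorted((control, target))
    for j in range(len(state) >> 2):
        # Enumerate the remaining bits in j, then force both gate bits to 1.
        i = insert_bit(insert_bit(j, lo, 1), hi, 1)
        state[i] = -state[i]
    return len(state) >> 2


def apply_cnot(state, control, target):
    """Swap amplitude pairs that differ in the target bit, where control is 1.

    Returns the number of amplitudes touched (half of the state vector);
    entries with control bit 0 are never read or written.
    """
    lo, hi = sorted((control, target))
    for j in range(len(state) >> 2):
        base = insert_bit(insert_bit(j, lo, 0), hi, 0)
        i0 = base | (1 << control)   # control = 1, target = 0
        i1 = i0 | (1 << target)      # control = 1, target = 1
        state[i0], state[i1] = state[i1], state[i0]
    return len(state) >> 1


if __name__ == "__main__":
    # Unnormalized amplitudes, chosen only so each entry is easy to track.
    state = [complex(i) for i in range(8)]
    touched = apply_cz(state, control=0, target=2)
    print(touched, "of", len(state), "amplitudes touched by CZ")
```

A generic 4x4 controlled gate would read and write every amplitude, so in this sketch the specialized paths touch a half or a quarter of the data, which matters most when each access goes to storage rather than DRAM.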

Parallel Abstract


With recent advances in quantum computing, high-efficiency simulation of quantum computation has become an active research topic for developing quantum computing systems and applications. However, it is challenging to simulate large-scale quantum circuits on traditional computers, because the runtime and memory capacity required for the simulation grow exponentially with the number of quantum bits (qubits). While a computer cluster may be used to enlarge the scale of simulation by extending the computing and memory resources, it incurs very high costs for the users. This paper proposes a cost-effective method for performing quantum circuit simulation on a single computer with non-volatile memories (NVM) and optimization schemes. The proposed method not only takes advantage of the large capacity offered by NVM, but also optimizes the data access patterns for NVM to favor contiguous accesses and data reuse. For specific commonly used quantum gates, including CNOT, CZ, CS, CP(theta), SWAP, and Toffoli, we make special arrangements to gain extra speed. In addition, as NVM is accessed via I/O and is much slower than regular memories such as DRAM, we propose a quantum circuit scheduler that aggregates quantum gates into k-qubit unitary gates, which reduces the circuit depth and decreases the number of data fetches from NVM.

To evaluate the performance of the proposed method, we run a series of benchmark circuits and compare the results against QuEST, one of the most popular quantum circuit simulators. The experimental results show that our work successfully enables the user to simulate quantum circuits beyond the capacity of regular memories with NVM at an affordable cost and a reasonable speed.

A single NVMe disk already offers terabytes of capacity at approximately 1/100 of the price of DRAM, and a typical computer can easily attach arrays of NVMe disks to provide hundreds of terabytes of capacity, whereas today's high-end servers can host only several terabytes of DRAM. While our work is not intended to simulate small quantum circuits that fit in the DRAM of a computer, the results from small circuits serve as references for estimating the performance gap between DRAM-based and NVM-based simulation of large circuits, should the user be able to afford the high cost of DRAM. Because the data fetched from NVM can be cached in the system memory for reuse, the speed of our work varies with the size of the circuits. Without the proposed scheduler, for regular unitary gates and specialized gates, QuEST outperforms our work in the worst case by 2.0x for smaller circuits and 10.9x for larger circuits in terms of speed, mainly due to the performance bottleneck caused by I/O operations that access NVM across the PCIe bus. With the proposed scheduler, our NVM-based simulator can outperform QuEST by 1.2x for large random circuits, which demonstrates the effectiveness of the proposed scheduling technique.
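The effect of aggregating gates can be sketched as follows. The abstract does not describe the scheduler's grouping policy, so this example only shows the underlying idea under stated assumptions: two single-qubit gates on different qubits are merged into one 4x4 unitary with a Kronecker product, and the merged gate is applied in a single sweep over the state vector instead of two. The function names (`kron`, `apply_gate`), the bit-ordering convention, and the in-memory list standing in for NVM are all illustrative choices, not the thesis's implementation.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    size = n * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def apply_gate(state, matrix, qubits):
    """Apply a 2^k x 2^k matrix to `qubits` in one sweep over `state` (in place).

    Convention (an assumption of this sketch): bit b of the matrix row/column
    index corresponds to qubits[b].
    """
    dim = 1 << len(qubits)
    mask = sum(1 << q for q in qubits)
    for base in range(len(state)):
        if base & mask:
            continue  # each group is handled once, from its all-zero member
        # Global indices of the 2^k amplitudes that this gate mixes together.
        idx = [base | sum(((local >> b) & 1) << q for b, q in enumerate(qubits))
               for local in range(dim)]
        amps = [state[i] for i in idx]
        for r in range(dim):
            state[idx[r]] = sum(matrix[r][c] * amps[c] for c in range(dim))


S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]   # Hadamard
X = [[0, 1], [1, 0]]    # Pauli-X

if __name__ == "__main__":
    separate = [0j] * 8
    separate[0] = 1 + 0j
    fused = list(separate)

    # Two sweeps: H on qubit 0, then X on qubit 1.
    apply_gate(separate, H, [0])
    apply_gate(separate, X, [1])

    # One sweep: kron(X, H) acts as H on qubit 0 and X on qubit 1.
    apply_gate(fused, kron(X, H), [0, 1])

    print(max(abs(p - q) for p, q in zip(separate, fused)))
```

Each call to `apply_gate` reads and writes the whole state vector once, so in this sketch the fused version halves the number of sweeps. The trade-off is a larger matrix per group, which is why the gate width k has to stay small in practice.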

