Computing Energy Levels and Wave Functions of the Single-Electron Problem Using Deep Neural Networks

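We use deep neural networks to solve the Schrödinger equation, a well-known equation in physics and quantum mechanics. The Schrödinger equation comes in time-dependent and time-independent forms and can describe single-particle or many-particle problems; different potentials give different solutions, and the equation can even be generalized to higher-dimensional spaces. Because the general problem is broad and complex, this thesis focuses on the time-independent single-electron problem, with the domain restricted to a two-dimensional box (2D box). We implement the deep neural networks in Python with the TensorFlow package. Using fully connected neural network (FCNN) and deep residual network (ResNet) architectures, we study the wave functions and corresponding energies of the Schrödinger equation under two potentials: the infinite potential well and the simple harmonic oscillator. The program has the following features: (1) It uses unsupervised learning. (2) It modifies the total loss function in three different ways, adding one or two penalty parameters. (3) It uses BFGS, a quasi-Newton second-order optimization algorithm, in place of first-order optimization algorithms such as gradient descent, Adagrad, and Adam. (4) It runs BFGS twice to improve the accuracy of the solution (the wave function and its corresponding energy). (5) It has few trainable parameters (800 to 1000), so training time is relatively short. Beyond using deep neural networks to solve an eigenvalue problem for a partial differential equation, the main breakthrough of this thesis is that the wave function (the eigenvector) and the energy (the eigenvalue) of the Schrödinger equation are trained simultaneously, and that the Wielandt deflation technique lets the solutions be obtained one after another in ascending order of energy level.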
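The role of Wielandt deflation in obtaining the levels in ascending order can be illustrated with a minimal sketch. It is not the thesis's neural-network method: it assumes a finite-difference discretization of the 2D infinite potential well on the unit box, units with ħ = m = 1, and a direct eigensolver (`np.linalg.eigh`) standing in for the trained network and BFGS. The function names, grid size, and shift value are hypothetical choices made for illustration only.

```python
import numpy as np

def box_hamiltonian_2d(n, length=1.0):
    """Finite-difference H = -(1/2) * Laplacian on an n x n interior grid.

    The wave function is assumed to vanish on the walls of the box
    (infinite potential well), in units where hbar = m = 1.
    """
    h = length / (n + 1)
    # 1D second-derivative matrix with Dirichlet boundary conditions.
    d2 = (np.diag(-2.0 * np.ones(n))
          + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1)) / h**2
    eye = np.eye(n)
    return -0.5 * (np.kron(d2, eye) + np.kron(eye, d2))

def lowest_eigenpair(matrix):
    """Stand-in for the trained model: return the lowest eigenvalue and eigenvector."""
    values, vectors = np.linalg.eigh(matrix)
    return values[0], vectors[:, 0]

def deflated_levels(hamiltonian, count, shift):
    """Find `count` levels in ascending order using Wielandt deflation.

    After each state psi is found, H is replaced by H + shift * psi psi^T,
    which raises that state's eigenvalue by `shift` and leaves every
    orthogonal eigenstate untouched, so the next-lowest level becomes
    the new minimum. `shift` must exceed the spread of the wanted levels.
    """
    work = hamiltonian.copy()
    levels = []
    for _ in range(count):
        energy, psi = lowest_eigenpair(work)
        levels.append(energy)
        work = work + shift * np.outer(psi, psi)
    return np.array(levels)

hamiltonian = box_hamiltonian_2d(n=24)
levels = deflated_levels(hamiltonian, count=3, shift=1.0e3)
print(levels)
```

For the unit box in these units the exact levels are E = (π²/2)(n_x² + n_y²), so the first printed value should lie close to π² up to discretization error, and the next two should nearly coincide because the first excited level is doubly degenerate.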

Keywords

Fully connected neural network; Residual network; Schrödinger equation; BFGS; Wielandt deflation; Quadratic penalty method; Excited state energy

Parallel Abstract

We use deep neural networks to solve the Schrödinger equation, which is well known in physics and quantum mechanics. The biggest breakthrough in this thesis is that we are able to train a model to obtain several energies and their corresponding wave functions simultaneously. Combined with the Wielandt deflation technique, the energies are obtained in ascending order of energy level. In addition, we use a fully connected neural network (FCNN) and a residual network (ResNet) as models to find the energy levels and wave functions of systems under two different external potentials: the infinite potential well and the simple harmonic oscillator. Our method has several features: (1) Because it is hard to create labels for our training data, we use unsupervised learning. (2) We modify the total loss function in three different ways by adding one or two penalty parameters. (3) We use BFGS, a quasi-Newton method and a second-order optimization algorithm, instead of first-order optimization algorithms such as gradient descent, Adagrad, and Adam. (4) To improve the accuracy of the solution (the wave function and its corresponding energy), we use BFGS twice. (5) Our models have only a small number of trainable parameters (800 to 1000), so training takes less time.

Parallel Keywords

Fully connected neural network; Residual network; Schrödinger equation; BFGS; Wielandt deflation; Quadratic penalty method; Excited state energy