電腦象棋之研究

電腦象棋是人工智慧研究的重要領域。遊戲代理（game agents）一直是人工智慧研究的切入點和評測基準，並且象棋是世界上最受歡迎的棋盤遊戲之一，因此，研究電腦象棋所使用的技術是很有價值的。根據象棋比賽的進行，這些技術的使用可以大致分為三個階段，即開局，中局和殘局。本論文首先將工作層級搜尋法（job-level search）應用於alpha-beta搜尋法，基於最佳優先搜尋法（BFS）版本的MTD（f），我們提出了工作層級alpha-beta搜尋法（JL-ABS），工作層級搜尋法是一種通過將工作（jobs）分配給遠端工作機（remote workers）進行平行處理來解決電腦遊戲的應用問題的方法。本論文爲了論證JL-ABS而使用該演算法在象棋開局庫中進行分析。實驗結果表明，在工作層級系統中使用16個工作機時，JL-ABS的速度提高了10.69倍。另外，本論文介紹改進的比較訓練法（comparison training），並應用在自動特徵權重調整，目標是改善象棋程式中使用的審局函數（evaluation functions）。首先，我們應用n-tuple網絡（n-tuple networks）提取特徵，n-tuple網絡通過其大量特徵而只需要很少的專家知識，同時又可以輕鬆存取。其次，我們提出了一種改進的比較訓練法，將tapered eval納入了其中。實驗表明，在使用相同特徵和相同象棋程序的情況下，自動調整的特徵權重相對於手動調整的特徵獲勝率為86.58％。然後，通過添加額外的特徵（尤其是n-tuple特徵）改進了上述的訓練過的版本。改進後的訓練版本與沒有額外的特徵的訓練版本相比，勝率達到了81.65％。最後，長將和長捉是象棋的兩個特殊規則，在殘局資料庫中，distance-to-check和distance-to-chase這兩個度量標準分別記錄了從當前盤面到應用這兩個規則的盤面的所需的步數距離。過去，這兩個度量標準分別記錄在資料庫中的不同層別（level）。然而，如果不比較兩個度量標準的距離，那麼殘局策略效率不高。此外，層別的數量越多，資料庫就越大。在本論文中，我們修改了回溯分析演算法（retrograde analysis，一種用於建構殘局資料庫的演算法），以合併兩個特殊規則的度量標準。實驗表明，殘局策略變得更有效率，並且殘局資料庫所需的儲存空間也減小了。

關鍵字

象棋；電腦遊戲；機器學習； α-β搜尋

並列摘要

Computer Chinese chess is an important field to researches in artificial intelligence. Game agents have been an entry point and benchmark for artificial intelligence research and Chinese chess is one of the most popular board games in the world. Thus, it is worthy to study the techniques applied in computer Chinese chess. According to the progress of the Chinese chess game, the adopted techniques can be roughly classified into three stages, namely the opening, the midgame, and the endgame. This thesis first applies job-level (JL) search to alpha-beta search and proposes a job-level alpha-beta search (JL-ABS) algorithm based on a best-ﬁrst search version of MTD(f). Job-level search is an approach for solving computer game applications by dispatching jobs to remote workers for parallel processing. The JL-ABS algorithm is demonstrated by using it in an opening book analysis for Chinese chess. The experimental results demonstrated that JL-ABS reached a speed-up of 10.69 when using 16 workers in the JL system. In addition, this thesis describes the application of a modified comparison training for automatic feature weight tuning. The objective is to improve the evaluation functions used in Chinese-chess programs. First, we apply n-tuple networks to extract features. N-tuple networks require very little expert knowledge through its large numbers of features, while simultaneously allowing easy access. Second, we propose a modified comparison training into which tapered eval is incorporated. Experiments show that with the same features and the same Chinese-chess program, the version with automatically tuned feature weights achieved a win rate of 86.58% against the version with hand-tuned features weights. The trained version was then improved by adding additional features, most importantly n-tuple features. This improved version achieved a win rate of 81.65% against the trained version without additional features. Finally, perpetual check and perpetual chase are two special rules in Chinese chess. Two metrics, distance-to-check and distance-to-chase, are adopted in endgame databases to respectively record the distances measured in plies from the current position to the positions where the two rules apply. In the past, the two metrics are separately recorded at different levels in the databases. However, the endgame strategy is not efficient if the distances of the two metrics are not compared. Moreover, the more the number of levels is, the bigger the database is. In this thesis, we modify retrograde analysis, an algorithm to build endgame databases, to merge the two metrics for special rules. The experiments show that the endgame strategy becomes efficient and the sizes of endgame databases are reduced as well.

並列關鍵字

Chinese chess ； Computer games ； Machine learning ； Alpha-beta search

參考文獻

Allis, L. V., “Searching for Solutions in Games and Artificial Intelligence,” Ph.D. Thesis, University of Limburg, Maastricht, The Netherlands, 1994.

Google Scholar

Allis, L. V., van der Meulen, M., and van den Herik, H. J., “Proof Number Search,” Artificial Intelligence, vol. 66(1), pp. 91-124, 1994.

Google Scholar

Baier, H. and Winands, M. H. M., “Active Opening Book Application for Monte-Carlo Tree Search in 19×19 Go,” In Proceedings of the 23rd Benelux Conference on Artificial Intelligence (BNAIC 2011), pp. 3-10, 2011.

Google Scholar

Baxter, J., Tridgell, A., and Weaver, L., “Learning to Play Chess Using Temporal Differences,” Machine Learning 40(3), 243-263, 2000.

Google Scholar

Beal, D. F., “A Generalised Quiescence Search Algorithm,” Artificial Intelligence 43(1), 85-98, 1990.

Google Scholar

國際替代計量

電腦象棋之研究

全文下載

主題瀏覽