Improving a Mahjong Program by Strengthening the Scoring of Specific Tile Patterns

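In recent years, as artificial intelligence has flourished in computer games, the playing strength of game programs has grown markedly. Mahjong is a multiplayer, stochastic, imperfect-information game, and its randomness and limited information add to the game's complexity and difficulty. This thesis targets the rules of Taiwanese Mahjong and, drawing on earlier related research, improves a Mahjong program. It continues the thesis "Using the Enhancement Strategy from Discarded Tiles Information to Improve Mahjong Program", keeping the rule-based approach and the deficiency-number computation as the main program architecture, and proposes algorithms that address the shortcomings of the original program. The improvements cover two aspects, offense and defense. On offense, the program keeps the original deficiency-number computation as its main framework to win quickly, and places further emphasis on collecting Tai so that it earns more points. On defense, it further lowers the rate at which the program discards an opponent's winning tile, reducing the loss of points. Experimental data show that every version of the improved program, Seofon_v2, achieves a win rate above 56% when playing against the original program, Seofon.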

Keywords

Mahjong; Imperfect-Information Game; Rule-Based; Tai (scoring units)

Parallel Abstract

In recent years, artificial intelligence has advanced rapidly in the field of computer games, and the strength of game-playing programs has improved dramatically. Mahjong is a multiplayer, probabilistic, imperfect-information game, and these characteristics increase its complexity and difficulty. This thesis focuses on the rules of Taiwanese Mahjong and, with reference to related research, improves a Mahjong program. We follow up on the previous thesis titled "Using the Enhancement Strategy from Discarded Tiles Information to Improve Mahjong Program", which used a rule-based approach and the computation of the "deficiency number" as its main framework. Algorithms are proposed to address the shortcomings of the original program. The improvements cover two aspects, offense and defense. On offense, the program keeps the original framework of computing the "deficiency number" to win quickly, and also tries to collect more Tai (scoring units, equivalent to Faan) where possible. On defense, it focuses further on avoiding discards that let an opponent win, which reduces the loss of points. The experimental results show that the proposed algorithms, implemented in the program Seofon_v2, achieve a win rate of more than 56% against the original program, Seofon.

Parallel Keywords

Mahjong; Imperfect Information Game; Rule-Based; Faan

