Graduate Student: 林宗翰 Lin, Zong-Han
Thesis Title: 利用棄牌資訊強化策略改良麻將程式 Using the Enhancement Strategy from Discarded-Tiles Information to Improve Mahjong Program
Advisor: 林順喜 Lin, Shun-Shii
Committee Members: 許舜欽、吳毅成、顏士淨、陳志昌、周信宏、張紘睿、林順喜
Oral Defense Date: 2021/08/25
Degree: Master
Department: 資訊工程學系 Department of Computer Science and Information Engineering
Publication Year: 2021
Academic Year: 109
Language: Chinese
Pages: 44
Keywords (Chinese): 不完全資訊遊戲、規則導向、棄牌資訊、麻將
Keywords (English): Imperfect Information Game, Rule-based, Discarded-Tiles Information, Mahjong
Research Methods: Experimental Design, Comparative Research
DOI URL: http://doi.org/10.6345/NTNU202101337
Thesis Type: Academic Thesis
Access Counts: Views: 134, Downloads: 39
Mahjong is a multiplayer, stochastic, imperfect-information game. As a long-standing and popular game, it has developed many regional rule variants. This thesis takes Taiwanese Mahjong as its research topic and improves a Mahjong program based on the rule-based framework proposed by previous researchers.

The architecture follows the thesis "Using Other Players' Information Models to Improve Mahjong Program": the hand is decomposed by computing the number of tiles still needed to win (the deficiency), and several algorithmic ideas are proposed to address the weaknesses observed in that framework.

In the original framework, the other players' discarded tiles are collected to infer which tiles they do not need, and this information is applied to endgame defense. This thesis additionally applies the discard information to hand construction: the weights of the tiles in hand are adjusted according to the other players' discards, so that when candidate tiles have similar numbers of remaining copies, the hand is steered toward shapes that can more easily be completed from the other players' discards. This strategy is more aggressive than the original one, which relied only on draw probabilities.

Experimental results show that the new program, named Seofon, achieves a 56% win rate against the original program, zei. It also won silver medals in the Mahjong events of the ICGA 2020 and TCGA 2021 computer-game tournaments and a gold medal at ICGA 2021, trading wins with the other two gold-medal programs.
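The draw-probability side of the strategy rests on counting how many copies of a tile are still unseen. The thesis does not publish its code; the sketch below is a minimal, hypothetical illustration (function and parameter names are our own) of counting remaining copies from the information visible to one player: the own hand, every discard pile, and revealed melds.

```python
from collections import Counter

def remaining_count(tile, hand, discard_piles, revealed_melds=()):
    """Copies of `tile` still unseen from this player's point of view:
    4 in total, minus copies in our own hand, in every player's
    discard pile, and in melds revealed on the table."""
    seen = Counter(hand)
    for pile in discard_piles:
        seen.update(pile)
    seen.update(revealed_melds)
    return 4 - seen[tile]

# Tiles encoded as small integers for brevity, e.g. 1..9 = one suit.
hand = [1, 1, 2, 3]
discard_piles = [[1], [5, 1]]
print(remaining_count(1, hand, discard_piles))  # all four 1s visible -> 0
print(remaining_count(9, hand, discard_piles))  # no 9 seen yet -> 4
```

A tile with a higher remaining count is more likely to be drawn or discarded later, which is the baseline signal the original framework used.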
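The abstract describes steering the hand toward waits that opponents are likely to discard, but gives no formula. One hypothetical way to realize this (the function, the additive bonus form, and the 0.25 coefficient are all our own assumptions, not the thesis's actual method) is to score each candidate waiting tile by its unseen copies, then boost waits that opponents have already discarded, since a player who discards a tile evidently does not need it and may release similar tiles again:

```python
def wait_weight(tile, remaining, discard_piles, bonus=0.25):
    """Score a candidate waiting tile. The base term is the number of
    unseen copies (a proxy for draw probability); each copy an opponent
    has already discarded hints they do not need it and may discard it
    again, so such waits receive a multiplicative bonus."""
    discarded = sum(pile.count(tile) for pile in discard_piles)
    return remaining * (1.0 + bonus * discarded)

discard_piles = [[4, 7], [7]]
# Same number of unseen copies, but 7 was discarded twice vs. once for 4,
# so a hand waiting on 7 is preferred:
print(wait_weight(7, 2, discard_piles))  # 2 * (1 + 0.25*2) = 3.0
print(wait_weight(4, 2, discard_piles))  # 2 * (1 + 0.25*1) = 2.5
```

With such a score, two hand shapes that are tied on remaining copies are separated by how willingly opponents part with the needed tiles, which matches the "more aggressive" behavior described above.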
[1] 徐讚昇、許舜欽、陳志昌、蔣益庭、陳柏年、劉雲青、張紘睿、蔡數真、林庭羽、范綱宇, Introduction to Computer Games (電腦對局導論), National Taiwan University Press, 2017.
[2] Wikipedia, Introduction to the Basic Rules of Taiwanese Mahjong, https://zh.wikipedia.org/wiki/%E5%8F%B0%E7%81%A3%E9%BA%BB%E5%B0%87.
[3] 陳新颺, Design and Implementation of the Computer Mahjong Program ThousandWind, Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, 2013.
[4] 吳俊緯, Design and Implementation of the Computer Mahjong Program MahJongDaXia, Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, 2015.
[5] 林猷琛, Using Other Players' Information Models to Improve Mahjong Program, Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, 2020.
[6] Donald E. Knuth and Ronald W. Moore, An Analysis of Alpha-Beta Pruning, Artificial Intelligence, Vol. 6(4):293-326, 1975.
[7] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. V. D. Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature, Vol. 529, no. 7587, pp. 484–489, 2016.
[8] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. V. D. Driessche, T. Graepel, and D. Hassabis, Mastering the Game of Go without Human Knowledge, Nature, Vol. 550, no. 7676, pp. 354–359, 2017.
[9] D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. Lillicrap, K. Simonyan and D. Hassabis, Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, arXiv:1712.01815v1, 2017.
[10] Noam Brown and Tuomas Sandholm, Superhuman AI for Heads-up No-limit Poker: Libratus Beats Top Professionals, Science, Vol. 359, pp. 418-424, 2018.
[11] Junjie Li and Sotetsu Koyamada, Suphx: Mastering Mahjong with Deep Reinforcement Learning, arXiv:2003.13590, 2020.
[12] Sanjiang Li and Xueqing Yan, Let's Play Mahjong, Computing Research Repository (CoRR), abs/1903.03294, 2019.
[13] Naoki Mizukami and Yoshimasa Tsuruoka, Building a Computer Mahjong Player Based on Monte Carlo Simulation and Opponent Models, IEEE Conference on Computational Intelligence and Games (CIG), pp. 275-283, 2015.
[14] Naoki Mizukami and Yoshimasa Tsuruoka, Computer Mahjong Players with Winning Strategies Based on Reinforcement Learning, The 21st Game Programming Workshop, pp. 81-88, 2016.