Graduate student: | 陳玠宇 Chen, Chieh-Yu |
---|---|
Thesis title: | 蜜月橋牌程式叫牌與換牌階段的策略改進 Strategy Improvements in the Bidding and Exchanging Stages of the Honeymoon Bridge Program |
Advisor: | 林順喜 Lin, Shun-Shii |
Oral defense committee: | 周信宏 Chou, Hsin-Hung; 顏士淨 Yen, Shi-Jim; 吳毅成 Wu, I-Chen; 張紘睿 Chang, Hung-Jui; 陳志昌 Chen, Jr-Chang; 林順喜 Lin, Shun-Shii |
Oral defense date: | 2022/08/03 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2022 |
Academic year of graduation: | 110 (ROC calendar) |
Language: | Chinese |
Number of pages: | 57 |
Keywords (Chinese): | 電腦對局、位元棋盤、殘局庫、蜜月橋牌、不完全資訊賽局 |
Keywords (English): | Computer games, Bitboard, Endgame database, Honeymoon Bridge, Incomplete information games |
Research method: | Experimental design |
DOI URL: | http://doi.org/10.6345/NTNU202201230 |
Thesis type: | Academic thesis |
Research on incomplete-information games still faces many unsolved difficulties, one of which is the very large number of possible states. This study uses the game of Honeymoon Bridge to deepen the understanding of incomplete-information games and to find ways of handling the explosive growth in the number of states.
Honeymoon Bridge is a three-stage game whose nature changes from stage to stage. By exploiting the characteristics of the game, this study achieves a real-time full search of a single level of the endgame database in the exchanging stage. We also wrote a new Honeymoon Bridge program built on a bitboard representation, which greatly improved the program's performance and raised the speed of reading the endgame database to 30 million searches per second.
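The record does not describe the thesis's actual bit layout, so the following is only a minimal sketch of what a bitboard representation of a card hand can look like. The layout (bit index = suit * 13 + rank), the two-character card codes, and all function names are assumptions made for illustration, as is the standard trick-taking rule that a player must follow suit when able.

```python
# Hypothetical layout (not taken from the thesis): a hand is one 52-bit
# integer, bit index = suit * 13 + rank, with rank 0 = two and 12 = ace.
SUITS = "CDHS"
RANKS = "23456789TJQKA"
SUIT_MASK = [0x1FFF << (13 * s) for s in range(4)]  # 13 bits per suit


def card_bit(card: str) -> int:
    """Map a card code such as 'SA' (suit, then rank) to its single-bit mask."""
    suit, rank = SUITS.index(card[0]), RANKS.index(card[1])
    return 1 << (13 * suit + rank)


def hand_from_cards(cards) -> int:
    """Pack an iterable of card codes into one integer."""
    hand = 0
    for card in cards:
        hand |= card_bit(card)
    return hand


def cards_from_hand(hand: int) -> list:
    """Unpack a hand into card codes, lowest bit first."""
    out = []
    while hand:
        low = hand & -hand            # isolate the lowest set bit
        idx = low.bit_length() - 1
        out.append(SUITS[idx // 13] + RANKS[idx % 13])
        hand ^= low                   # clear that bit
    return out


def count_cards(hand: int) -> int:
    """Number of cards in the hand (population count)."""
    return bin(hand).count("1")


def legal_plays(hand: int, led_suit) -> int:
    """Assumed follow-suit rule: play the led suit if held, else any card."""
    if led_suit is None:              # leading the trick
        return hand
    same_suit = hand & SUIT_MASK[led_suit]
    return same_suit if same_suit else hand
```

With this kind of encoding, set operations on hands (union, intersection, suit filtering) become single integer operations, which is the usual reason a bitboard speeds up search and table lookups.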
This study uses information from the playing stage in place of a card-strength table built from human experience, and uses sampling search to judge the quality of each available action. This frees the program's decisions from human experience in the playing stage, allows it to find good moves that lie outside human experience, and greatly improves its ability in the exchanging stage.
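The general idea of judging actions by sampling can be sketched as follows. This is not the thesis's implementation: the function names, the way hidden cards are split into an opponent hand and a stock, and the `score_fn` callback (a stand-in for whatever playing-stage evaluation is used) are all hypothetical.

```python
import random


def sample_hidden_cards(unseen_cards, opp_hand_size, rng):
    """Deal one random opponent hand from the cards the player cannot see.

    Returns (opponent_hand, remaining_stock); the split is an assumption.
    """
    cards = list(unseen_cards)
    rng.shuffle(cards)
    return cards[:opp_hand_size], cards[opp_hand_size:]


def evaluate_actions(actions, unseen_cards, opp_hand_size, score_fn,
                     n_samples=100, seed=0):
    """Average score_fn over sampled hidden states for every candidate action.

    score_fn(action, opp_hand, stock) is a hypothetical evaluator, for
    example an estimate of the result of the playing stage.
    """
    rng = random.Random(seed)
    totals = {action: 0.0 for action in actions}
    for _ in range(n_samples):
        opp_hand, stock = sample_hidden_cards(unseen_cards, opp_hand_size, rng)
        # Score every action against the same sampled world so that the
        # comparison between actions is not distorted by sampling noise.
        for action in actions:
            totals[action] += score_fn(action, opp_hand, stock)
    return {action: total / n_samples for action, total in totals.items()}


def best_action(average_scores):
    """Pick the action with the highest average sampled score."""
    return max(average_scores, key=average_scores.get)
```

A caller would pass the list of available actions together with an evaluator and then choose `best_action(evaluate_actions(...))`; more samples reduce variance at the cost of more evaluator calls.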
After the strategies for the bidding and exchanging stages were adjusted, the overall playing strength of the Honeymoon Bridge program improved considerably. It achieves a good win rate against human players and maintains a win rate of more than 60% against the previous programs.