
Graduate student: 陳玠宇 (Chen, Chieh-Yu)
Thesis title: 蜜月橋牌程式叫牌與換牌階段的策略改進 (Strategy Improvements in the Bidding and Exchanging Stages of the Honeymoon Bridge Program)
Advisor: 林順喜 (Lin, Shun-Shii)
Oral defense committee: 周信宏 (Chou, Hsin-Hung); 顏士淨 (Yen, Shi-Jim); 吳毅成 (Wu, I-Chen); 張紘睿 (Chang, Hung-Jui); 陳志昌 (Chen, Jr-Chang); 林順喜 (Lin, Shun-Shii)
Oral defense date: 2022/08/03
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: Chinese
Number of pages: 57
Chinese keywords: 電腦對局, 位元棋盤, 殘局庫, 蜜月橋牌, 不完全資訊賽局
English keywords: Computer games, Bitboard, Endgame database, Honeymoon Bridge, Incomplete information games
Research method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202201230
Thesis type: Academic thesis
  • Research on incomplete information games still faces many open difficulties, one of which is the very large number of possible states. This study examines the game of Honeymoon Bridge to deepen the understanding of incomplete information games and to find ways of handling the explosive growth in the number of states.
    Honeymoon Bridge is a three-stage game whose nature changes from stage to stage. By exploiting the characteristics of the game, this study completes, in a timely manner, a full single-level search of the exchanging stage using the endgame database. We also wrote a new Honeymoon Bridge program built on a bitboard representation, which greatly improved the program's performance and raised the speed of endgame database lookups to 30 million searches per second.
    This study replaces the card-strength table built from human experience with information from the playing stage, and uses sampling search to judge the quality of each available action. This frees the program's operation from human experience in the playing stage, lets it find good moves that lie outside human experience, and greatly improves its ability in the exchanging stage.
    After the strategies for the bidding and exchanging stages were adjusted, the overall playing strength of the Honeymoon Bridge program improved substantially. It achieves a good win rate against human players and wins more than 60% of its games against the previous programs.
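
    The abstract names two techniques, a bitboard representation and sampling search over possible opponent hands, without showing either. The following is a minimal illustrative sketch, not the thesis's implementation. It assumes a 52-bit integer with one bit per card, a uniform draw from the unseen cards, and a caller-supplied evaluator. Every name in it (card_bit, hand_mask, suit_bits, sample_opponent_hand, average_score) is hypothetical.

```python
import random

SUITS = "CDHS"            # clubs, diamonds, hearts, spades (assumed ordering)
RANKS = "23456789TJQKA"   # low to high
FULL_DECK = (1 << 52) - 1


def card_bit(card: str) -> int:
    """Map a card such as 'AS' (rank, then suit) to one bit of a 52-bit mask."""
    rank, suit = card[0], card[1]
    return 1 << (SUITS.index(suit) * 13 + RANKS.index(rank))


def hand_mask(cards) -> int:
    """Pack an iterable of card strings into a single integer bitboard."""
    mask = 0
    for card in cards:
        mask |= card_bit(card)
    return mask


def suit_bits(mask: int, suit: str) -> int:
    """Extract the 13 rank bits of one suit from a bitboard."""
    return (mask >> (SUITS.index(suit) * 13)) & 0x1FFF


def sample_opponent_hand(my_hand: int, seen: int, size: int,
                         rng: random.Random) -> int:
    """Draw one hypothetical opponent hand uniformly from the unseen cards."""
    unseen = FULL_DECK & ~(my_hand | seen)
    candidates = [1 << i for i in range(52) if (unseen >> i) & 1]
    mask = 0
    for bit in rng.sample(candidates, size):
        mask |= bit
    return mask


def average_score(action, my_hand, seen, size, evaluate,
                  samples=100, seed=0) -> float:
    """Average a caller-supplied evaluator over sampled opponent hands.

    `evaluate(action, my_hand, opponent_hand)` is a placeholder for whatever
    scoring a real program would use (for example an endgame database lookup).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        opponent = sample_opponent_hand(my_hand, seen, size, rng)
        total += evaluate(action, my_hand, opponent)
    return total / samples
```

    With this layout, set operations on hands reduce to single integer operations (`|`, `&`, `~`), which is the usual reason a bitboard speeds up search. A real program would replace the uniform draw with whatever dealing probabilities it has estimated for the opponent's hand.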

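    1. Introduction
        1.1 Research Background
        1.2 Research Purpose and Motivation
    2. Literature Review
        2.1 Introduction to Honeymoon Bridge
        2.2 Analysis of the Stages of Honeymoon Bridge
        2.3 Endgame Database
            2.3.1 Endgame Database Terminology and Early Endgame Databases
            2.3.2 Reducing Duplicate Card Patterns
            2.3.3 Speedup and Data Compression
            2.3.4 No-Trump Endgame Database
        2.4 Introduction to MuZero
    3. Methods and Procedures
        3.1 The Playing Stage of Honeymoon Bridge
            3.1.1 Data Structures
            3.1.2 Endgame Database Indexing
            3.1.3 Program Efficiency and the Playing Stage
        3.2 The Exchanging Stage
            3.2.1 Full Expansion of All Possibilities in a Single Exchange Round
            3.2.2 Responses as First and Second Player and the Problem of Possible Opponent Hands
            3.2.3 Random Sampling of the Opponent's Possible Hands
            3.2.4 Narrowing the Range of Candidate Plays in the Exchanging Stage
            3.2.5 Deal Probabilities for the Opponent's Hand and Enhanced Dealing
        3.3 Design of the Bidding Stage
            3.3.1 Values of High and Low Cards and Special Operations
            3.3.2 Card-Strength Table for the Bidding Stage
    4. Experiments and Competition Results
        4.1 Environment Setup
        4.2 Match Testing
            4.2.1 Match Platform for Different Programs
            4.2.2 Random Dealing and Enhanced Dealing
            4.2.3 Jade Hare versus BridgeYeh
            4.2.4 Jade Hare versus GodJimmy
        4.3 TCGA 2022 and ICGA 2022
            4.3.1 TCGA Match Records
            4.3.2 ICGA Match Records
            4.3.3 Post-Competition Reflections
    5. Future Directions
        5.1 Directions for Improving the Exchanging Stage
        5.2 Card-Strength Table for the Bidding Stage
    6. References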

