
Graduate Student: 程健倫 (Cheng, Chien-Lun)
Thesis Title: Multi-Objective Crowd-Aware Robot Navigation System Using Deep Reinforcement Learning (基於深度強化式學習之多目標人群導航機器人系統)
Advisor: 許陳鑑 (Hsu, Chen-Chien)
Oral Defense Committee: Saeed Saeedvand; 呂成凱 (Lu, Cheng-Kai); 蔡奇謚 (Tsai, Chi-Yi); 許陳鑑 (Hsu, Chen-Chien)
Oral Defense Date: 2023/07/03
Degree: Master
Department: Department of Electrical Engineering
Publication Year: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: Chinese
Pages: 66
Keywords: Deep Reinforcement Learning, Human-Aware Motion Planning, Human-Robot Interaction, Obstacle Avoidance
Research Methods: Experimental Design, Comparative Research
DOI URL: http://doi.org/10.6345/NTNU202301132
Thesis Type: Academic Thesis
Abstract: Autonomous mobile robots (AMRs) have attracted considerable attention due to their versatility and are now widely deployed in automated factories and in human-robot coexistence environments such as airports and shopping malls. To navigate through crowds, a robot must be socially aware and able to anticipate pedestrian movements. However, previous approaches that first predict pedestrians' future trajectories and then plan a safe path suffer from the high randomness of pedestrian motion, leading to increased computational cost and the freezing-robot problem. With the development of deep learning, many navigation studies have adopted deep reinforcement learning, which allows a robot to learn an optimal policy through interaction with its environment. Socially Attentive Reinforcement Learning (SARL) is a state-of-the-art method for improving robot navigation in crowded environments, but it still has several shortcomings. This study therefore proposes a multi-objective crowd-aware robot navigation system based on deep reinforcement learning, whose reward functions are designed to achieve multiple navigation objectives, including safety, time efficiency, collision avoidance, and path smoothness. To address hesitation during navigation in crowds, we also develop a Multi-Objective Dual-Selection Attention module (MODSRL) that enables the robot to make more efficient decisions while reducing wandering in the initial phase of navigation. Experimental results show that the proposed MODSRL method outperforms existing methods on five different metrics, demonstrating its robustness in navigating complex crowd environments.
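
Since the full text is embargoed, the following sketches are only illustrative readings of the abstract. First, a minimal example of how the multiple navigation objectives named above (safety, time efficiency, collision avoidance, path smoothness) might be scalarized into a single reward signal; every weight, threshold, and name here is a hypothetical assumption, not the thesis's actual reward design.

```python
# Illustrative sketch only: a scalarized multi-objective reward for
# crowd-aware navigation combining safety, time efficiency (progress),
# collision avoidance, and path smoothness, as the abstract describes.
# All weights, thresholds, and field names are hypothetical assumptions.
from dataclasses import dataclass

@dataclass
class StepInfo:
    goal_dist: float        # current distance from robot to goal (m)
    prev_goal_dist: float   # goal distance at the previous step (m)
    min_human_dist: float   # distance to the nearest pedestrian (m)
    heading_change: float   # change in robot heading this step (rad)
    collided: bool          # True if the robot hit a pedestrian
    reached_goal: bool      # True if the robot arrived at the goal

def multi_objective_reward(s: StepInfo,
                           w_safety: float = 0.25,
                           w_progress: float = 1.0,
                           w_smooth: float = 0.1,
                           comfort_radius: float = 0.4) -> float:
    """Combine the objectives into one scalar via a weighted sum."""
    if s.collided:
        return -0.25                      # collision: terminal penalty
    if s.reached_goal:
        return 1.0                        # success: terminal bonus
    # Safety: negative when intruding into a pedestrian's comfort zone.
    safety = min(0.0, s.min_human_dist - comfort_radius)
    # Time efficiency: reward progress made toward the goal this step.
    progress = s.prev_goal_dist - s.goal_dist
    # Smoothness: penalize abrupt heading changes (reduces wandering).
    smoothness = -abs(s.heading_change)
    return w_safety * safety + w_progress * progress + w_smooth * smoothness
```

A weighted sum is the simplest scalarization; shifting w_safety against w_progress is one way to realize the safety versus time-efficiency trade-off that Section 4.3 of the table of contents evaluates. Similarly, the abstract names the Multi-Objective Dual-Selection Attention module (MODSRL) but does not specify its internals, so the top-k-then-softmax scheme below is purely a guess at what "dual selection" over pedestrian features could look like; the class name, dimensions, and selection steps are all assumptions.

```python
# Hypothetical sketch of a "dual-selection" attention step over pedestrian
# features: score each human-robot interaction embedding, keep the top-k
# most relevant pedestrians (first selection), then softmax-weight the
# survivors (second selection). Not the thesis's confirmed architecture.
import torch
import torch.nn as nn

class DualSelectionAttention(nn.Module):
    def __init__(self, feat_dim: int = 32, top_k: int = 3):
        super().__init__()
        # Small MLP that scores each human-robot interaction embedding.
        self.score = nn.Sequential(
            nn.Linear(feat_dim, 16), nn.ReLU(), nn.Linear(16, 1))
        self.top_k = top_k

    def forward(self, human_feats: torch.Tensor) -> torch.Tensor:
        # human_feats: (num_humans, feat_dim) interaction embeddings.
        scores = self.score(human_feats).squeeze(-1)   # (num_humans,)
        k = min(self.top_k, human_feats.size(0))
        top_scores, idx = scores.topk(k)               # selection 1: keep top-k
        weights = torch.softmax(top_scores, dim=0)     # selection 2: soft weights
        # Weighted sum yields a fixed-size crowd representation that a
        # SARL-style value network (deep V-learning) could consume.
        return (weights.unsqueeze(-1) * human_feats[idx]).sum(dim=0)
```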

Table of Contents:
Chapter 1  Introduction
  1.1  Research Background
  1.2  Research Objectives
  1.3  Thesis Organization
Chapter 2  Literature Review
  2.1  Traditional Methods
    2.1.1  Reaction-Based Methods
    2.1.2  Trajectory-Prediction-Based Methods
  2.2  Deep Reinforcement Learning Methods
  2.3  Multi-Objective Deep Reinforcement Learning Methods
Chapter 3  Multi-Objective Crowd Navigation Strategy
  3.1  Crowd Environment Simulation Algorithm Design
    3.1.1  ORCA-Based Crowd Simulation
    3.1.2  Crowd Simulation Considering Pedestrian Emotions
  3.2  Deep Neural Network Architecture of the Social Attention Module
    3.2.1  Interactive Execution Module
    3.2.2  Dual-Selection Attention Module
    3.2.3  Action Planning Module
  3.3  Deep V-Learning Algorithm
    3.3.1  Imitation Learning
    3.3.2  Reinforcement Learning
    3.3.3  Multi-Objective Reward Function and Trade-off Evaluation Design
Chapter 4  Experimental Results
  4.1  Experimental Setup
    4.1.1  Software System
    4.1.2  Hardware System
  4.2  Experiments with Fixed and Random Pedestrian Speeds
    4.2.1  Fixed Environment
    4.2.2  Random Environment
  4.3  Safety vs. Time Efficiency Trade-off Experiments
Chapter 5  Conclusions and Future Work
  5.1  Conclusions
  5.2  Future Work
References

    R. C. Arkin and R. R. Murphy, “Autonomous navigation in a manufacturing environment,” in IEEE Transactions on Robotics and Automation, vol. 6, no. 4, pp. 445-454, Aug. 1990.
    J. Wang and M. Q.-H. Meng, “Socially Compliant Path Planning for Robotic Autonomous Luggage Trolley Collection at Airports,” Sensors, vol. 19, no. 12, p. 2759, Jun. 2019.
    J. Forlizzi and C. DiSalvo, “Service robots in the domestic environment,” in Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, Mar. 2006.
    Sterilization robot - Passenger Terminal Today. Available from: https://www.passengerterminaltoday.com/videos/sterilization-robot.html.
    H. Durrant-Whyte and T. Bailey, “Simultaneous localization and mapping: part I,” in IEEE Robotics & Automation Magazine, vol. 13, no. 2, pp. 99-110, June 2006.
    F. Dellaert, D. Fox, W. Burgard and S. Thrun, “Monte Carlo localization for mobile robots,” Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C), Detroit, MI, USA, 1999, vol. 2, pp. 1322-1328.
    K. Zhu and T. Zhang, “Deep reinforcement learning based mobile robot navigation: A review,” in Tsinghua Science and Technology, vol. 26, no. 5, pp. 674-691, Oct. 2021.
    A. Faust et al., “PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning,” arXiv, 2017.
    OpenAI Gym. Available from: https://github.com/openai/gym.
    C. Mavrogiannis et al., “Core Challenges of Social Robot Navigation: A Survey.” arXiv, 2021.
    D. Helbing and P. Molnár, “Social force model for pedestrian dynamics,” Physical Review E, vol. 51, no. 5, pp. 4282-4286, May 1995.
    J. van den Berg, M. Lin and D. Manocha, “Reciprocal Velocity Obstacles for real-time multi-agent navigation,” 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, 2008, pp. 1928-1935.
    J. van den Berg, S. J. Guy, M. Lin, and D. Manocha, “Reciprocal n-Body Collision Avoidance,” in Robotics Research, ser. Springer Tracts in Advanced Robotics, C. Pradalier, R. Siegwart, and G. Hirzinger, Eds. Springer Berlin Heidelberg, 2011, pp. 3–19.
    P. Trautman, J. Ma, R. M. Murray and A. Krause, “Robot navigation in dense human crowds: the case for cooperation,” 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 2013, pp. 2153-2160.
    A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei and S. Savarese, “Social LSTM: Human Trajectory Prediction in Crowded Spaces,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 961-971.
    A. Gupta, J. Johnson, L. Fei-Fei, S. Savarese, and A. Alahi, “Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks.” arXiv, 2018.
    P. Trautman and A. Krause, “Unfreezing the robot: Navigation in dense, interacting crowds,” 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 2010, pp. 797-803.
    Y. F. Chen, M. Liu, M. Everett, and J. P. How, “Decentralized Non-communicating Multiagent Collision Avoidance with Deep Reinforcement Learning.” arXiv, 2016.
    Y. F. Chen, M. Everett, M. Liu, and J. P. How, “Socially Aware Motion Planning with Deep Reinforcement Learning.” arXiv, 2017.
    M. Everett, Y. F. Chen, and J. P. How, “Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning.” arXiv, 2018.
    C. Chen, Y. Liu, S. Kreiss, and A. Alahi, “Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning.” arXiv, 2018.
    H. Zeng, R. Hu, X. Huang, and Z. Peng, “Robot Navigation in Crowd Based on Dual Social Attention Deep Reinforcement Learning,” Mathematical Problems in Engineering, vol. 2021, pp. 1-11, Sep. 2021.
    Y. Lin, S. Song, J. Yao, Q. Chen, and L. Zheng, “Robot Navigation in Crowd via Deep Reinforcement Learning,” Research Square, Jun. 2022.
    L. Kästner, J. Li, Z. Shen, and J. Lambrecht, “Enhancing Navigational Safety in Crowded Environments using Semantic-Deep-Reinforcement-Learning-based Navigation.” arXiv, 2021.
    S. S. Samsani, “On Safety and Time Efficiency Enhancement of Robot Navigation in Crowded Environment utilizing Deep Reinforcement Learning,” IEEE, Dec. 2021.
    S. S. Samsani and M. S. Muhammad, “Socially Compliant Robot Navigation in Crowded Environment by Human Behavior Resemblance Using Deep Reinforcement Learning,” in IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 5223-5230, July 2021.
    K. Van Moffaert, M. M. Drugan and A. Nowé, “Scalarized multi-objective reinforcement learning: Novel design techniques,” 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), Singapore, 2013, pp. 191-199.
    T. T. Nguyen, N. D. Nguyen, P. Vamplew, S. Nahavandi, R. Dazeley, and C. P. Lim, “A Multi-Objective Deep Reinforcement Learning Framework,” arXiv, 2018.
    A. Ramezani Dooraki and D.-J. Lee, “A Multi-Objective Reinforcement Learning Based Controller for Autonomous Navigation in Challenging Environments,” Machines, vol. 10, no. 7, p. 500, Jun. 2022.
    Y. Chen, F. Zhao and Y. Lou, “Interactive Model Predictive Control for Robot Navigation in Dense Crowds,” in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 4, pp. 2289-2301, April 2022.
    M. Xu et al., “Crowd Behavior Simulation With Emotional Contagion in Unexpected Multihazard Situations,” in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 3, pp. 1567-1581, March 2021.
    M. Nishimura and R. Yonetani, “L2B: Learning to Balance the Safety-Efficiency Trade-off in Interactive Crowd-aware Robot Navigation.” arXiv, 2020.

Full Text: Electronic full text is embargoed and scheduled for release on 2028/08/04.