
Graduate Student: 許哲菡 (Hanjaya Mandala)
Thesis Title: 基於深度強化學習之移動大型重物 (Moving Large Size and Heavy Object with Deep Reinforcement Learning)
Advisor: 包傑奇 (Jacky Baltes)
Degree: Master
Department: Department of Electrical Engineering
Year of Publication: 2020
Academic Year of Graduation: 108
Language: English
Number of Pages: 80
Chinese Keywords: 人形機器人 (humanoid robot), 深度強化學習 (deep reinforcement learning), 拖動物件 (dragging object), 深度學習 (deep learning)
English Keywords: humanoid robot, deep reinforcement learning, dragging object, deep learning
DOI URL: http://doi.org/10.6345/NTNU202000829
Thesis Type: Academic thesis
Access Count: 158 views, 31 downloads
Abstract: Humanoid robots are designed and expected to work alongside humans. In daily life, Moving Large Size and Heavy Objects (MLHO) is a common activity and one that is dangerous to humans. In this thesis, we propose a novel hierarchical learning-based approach in which an adult-sized humanoid robot transports an object by dragging it. The proposed method demonstrates its robustness on the THORMANG-Wolf adult-sized humanoid robot, which drags an object of roughly double its own weight (84.6 kg) for 2 meters. The approach consists of three hierarchical deep learning-based algorithms that together solve the MLHO problem, divided between robot vision and robot behavior control. For robot vision control, we first propose deep learning algorithms for 3D object classification and surface detection.

For 3D object classification, we propose a Three-layer Convolution Volumetric Network (TCVN). The input to the TCVN model is a voxel-grid representation built from the point clouds acquired by the robot's LiDAR scanner. For surface detection, we propose a lightweight real-time instance segmentation model called Tiny-YOLACT (You Only Look At Coefficients) that segments the floor in the robot's camera images; Tiny-YOLACT is adapted from the YOLACT model and uses ResNet-18 as its backbone network. For robot behavior control, the main part of this thesis, we address the MLHO problem on an adult-sized humanoid robot with deep reinforcement learning for the first time. We propose a Deep Q-Learning algorithm, named DQL-COB, that trains a deep control policy to offset the Centre of Body (CoB) of the robot while it drags different objects. The CoB offset tracks the robot's center of mass so that the robot stays balanced by keeping the Zero Moment Point (ZMP) inside the support polygon. The DQL-COB algorithm was first trained in the ROS Gazebo simulator, avoiding experiments that are costly in time and constrained by the real environment, and was then deployed on the real robot on three different types of surfaces.
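As a rough illustration of the 3D classification stage, the following is a minimal PyTorch sketch of a three-layer volumetric CNN in the spirit of TCVN. The voxel-grid resolution (32×32×32), the channel widths, and the number of object classes are assumptions made for the example, not values taken from the thesis.

```python
import torch
import torch.nn as nn

class TCVNSketch(nn.Module):
    """Three-layer volumetric CNN for voxel-grid object classification (illustrative sketch)."""

    def __init__(self, num_classes: int = 8):  # number of classes is an assumption
        super().__init__()
        self.features = nn.Sequential(
            # Layer 1: single-channel occupancy grid -> 16 feature volumes
            nn.Conv3d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            # Layer 2
            nn.Conv3d(16, 32, kernel_size=3), nn.ReLU(),
            nn.MaxPool3d(2),
            # Layer 3
            nn.Conv3d(32, 64, kernel_size=3), nn.ReLU(),
        )
        # For a 32x32x32 input grid the feature volume ends up as 64 channels of 4x4x4
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        # voxels: (batch, 1, 32, 32, 32) binary occupancy grid built from LiDAR points
        return self.classifier(self.features(voxels))

# Example: classify a single (empty) 32x32x32 voxel grid
logits = TCVNSketch()(torch.zeros(1, 1, 32, 32, 32))
print(logits.shape)  # torch.Size([1, 8])
```

An occupancy grid of this kind can be obtained by discretizing the LiDAR point cloud into fixed-size cells and marking each cell that contains at least one point.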

To evaluate the stability of the THORMANG-Wolf robot with the proposed methods, we ran two types of experiments on three types of surfaces with eight different objects. In the first scenario, the learning algorithm receives both IMU data and foot force/torque (F/T) sensor data as input; in the second scenario it receives IMU data only. In these experiments, the success rate of the DQL-COB algorithm on the real robot is 92.91% with the F/T sensors and 83.75% without them. Moreover, the TCVN model achieves 90% accuracy on 3D object classification in real time. Correspondingly, the Tiny-YOLACT model achieves 34.16 mAP on the validation data at an average of 29.56 fps on a single NVIDIA GTX-1060 GPU.
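To make the behavior-control setup more concrete, below is a minimal Deep Q-Learning sketch in PyTorch of the kind of policy DQL-COB trains: the state is a small vector of IMU (and optionally F/T) readings, and each action is a discrete CoB offset. The state dimension, action set, network sizes, and hyperparameters are illustrative assumptions rather than the configuration used in the thesis.

```python
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 10   # e.g. IMU orientation/angular-rate terms plus F/T channels (assumed)
N_ACTIONS = 5    # e.g. discrete CoB offset steps {-2, -1, 0, +1, +2} (assumed)

class QNetwork(nn.Module):
    """Small fully connected Q-network: sensor state -> one Q-value per CoB offset action."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

q_net, target_net = QNetwork(), QNetwork()
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)      # replay buffer of (s, a, r, s', done) tensors
gamma, epsilon = 0.99, 0.1

def select_action(state: torch.Tensor) -> int:
    # Epsilon-greedy choice over the discrete CoB offset actions
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def train_step(batch_size: int = 32) -> None:
    # One Q-learning update from a random minibatch of replayed transitions
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * target_net(s2).max(1).values * (1.0 - done)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the thesis, this kind of policy is trained first in the ROS Gazebo simulator and then transferred to the real robot; the sketch covers only the Q-network, epsilon-greedy action selection, and the Q-learning update, and omits the reward design and target-network synchronization schedule.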

Table of Contents
Acknowledgment i
ABSTRACT ii
Table of Contents iv
List of Figures vi
List of Tables viii
Chapter 1: Introduction 1
    1.1. Background 1
    1.2. Problem statement 2
    1.3. The objective of the study 4
    1.4. Limitation of the study 5
Chapter 2: Literature Review 6
    2.1. Related work 6
        2.1.1. Pushing object 6
        2.1.2. Pivoting object 8
        2.1.3. Teleoperation manipulation 9
        2.1.4. Walking Balance (Learning-Based) 9
        2.1.5. Push Recovery (Learning-Based) 10
        2.1.6. Summary of related work 11
    2.2. Inverse Kinematic 12
    2.3. Walking Gait 14
    2.4. Neural Network 16
    2.5. Deep Learning 17
    2.6. Object Detection 19
    2.7. Reinforcement Learning 20
    2.8. Deep Reinforcement Learning 22
Chapter 3: Methodology 23
    3.1. THORMANG-Wolf Robot 23
        3.1.1. Hardware Description 23
        3.1.2. Software Description 26
        3.1.3. The Proposed Algorithm Design 27
    3.2. Robot Vision Process 30
        3.2.1. 3D Object Detection (Deep Learning) 30
        3.2.2. Floor Detection (Deep Learning) 35
    3.3. Robot Motion Control 38
        3.3.1. Object Grasping 38
        3.3.2. Walking Control 39
    3.4. Robot Behavior Control 42
        3.4.1. DQL-COB Algorithm Design 43
Chapter 4: Experimental Result 54
    4.1. Experimental Setup 55
    4.2. Experimental Result for Robot Vision 57
        4.2.1. 3D Object Classification Result 57
        4.2.2. Floor Detection Result 59
    4.3. Experimental Results for Robot Behavior 62
        4.3.1. DQL-COB Training Results 62
        4.3.2. DQL-COB Empirical Evaluation Result 66
Chapter 5: Closing 73
    5.1. Conclusion 73
    5.2. Future Work 74
Bibliographies 75
Autobiography 79
Academic Achievement 80

