| Field | Value |
|---|---|
| Graduate Student | 林裕仁 Lin, Yu-Ren |
| Thesis Title | Dr. Eureka: A Humanoid Robot Manipulation Case Study |
| Advisor | 包傑奇 Jacky Baltes |
| Degree | Master |
| Department | Department of Electrical Engineering |
| Year of Publication | 2019 |
| Graduation Academic Year | 107 (2018-2019) |
| Language | English |
| Pages | 56 |
| Keywords | Machine Learning, Inverse Kinematics, Dynamic Movement Primitives |
| DOI | http://doi.org/10.6345/NTNU201900552 |
| Document Type | Academic thesis |
To this day, manipulation still stands as one of the hardest challenges in humanoid robotics. In this thesis we examine the board game Dr. Eureka as a benchmark to promote further development in the field. The game consists of a race to solve a manipulation puzzle: reordering colored balls in transparent tubes, in which the solution requires planning, dexterity, and agility. We present a solution on the THORMANG3 humanoid robot that successfully integrates classical and state-of-the-art computer vision and robot manipulation techniques. We represent the puzzle states as a graph and solve the reordering task as a shortest-path problem, and we apply computer vision combined with precise motions to perform the manipulation. We also present a customized implementation of YOLO (called YOLO Dr. Eureka), and we implement an original neural network for the incremental solution of the inverse kinematics problem. We show that this network outperforms the Jacobian-inverse method for large step sizes. Finally, we use Dynamic Movement Primitives (DMP) together with Policy Improvement with Path Integrals (PI²), a reinforcement learning algorithm, to let the robot learn by itself and optimize the dexterous motion of pouring balls from one tube to another.
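As a rough illustration of the shortest-path formulation, the sketch below encodes each puzzle state as a tuple of tube contents and each legal pour as a unit-cost edge; with uniform edge costs, breadth-first search returns the same minimum-move plan Dijkstra's algorithm would. The state encoding, tube capacity, and two-color example are illustrative assumptions, not the thesis implementation.

```python
from collections import deque

def neighbors(state, capacity=3):
    """Yield every state reachable by pouring one top ball into another tube."""
    for src in range(len(state)):
        if not state[src]:
            continue  # nothing to pour from an empty tube
        for dst in range(len(state)):
            if dst != src and len(state[dst]) < capacity:
                tubes = [list(t) for t in state]
                tubes[dst].append(tubes[src].pop())  # move the top ball
                yield tuple(tuple(t) for t in tubes)

def shortest_solution(start, goal, capacity=3):
    """Breadth-first search over puzzle states; every pour costs 1 move."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        state = frontier.popleft()
        if state == goal:  # reconstruct the move sequence back to the start
            plan = []
            while state is not None:
                plan.append(state)
                state = parent[state]
            return plan[::-1]
        for nxt in neighbors(state, capacity):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None  # goal unreachable

# Two colors across three tubes of capacity 3; solvable in three pours.
start = (('R', 'G'), ('G', 'R'), ())
goal = (('R', 'R'), ('G', 'G'), ())
print(shortest_solution(start, goal))
```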
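The abstract compares the learned inverse kinematics against the classical incremental Jacobian method. Below is a minimal sketch of that baseline on a planar two-link arm; the link lengths, gain, and target are illustrative assumptions. The learned alternative would replace the pseudo-inverse update with a fully connected network mapping the current joint angles and position error to a joint increment, which is what lets it tolerate larger steps.

```python
import numpy as np

L1, L2 = 0.3, 0.25  # illustrative link lengths in meters

def forward(q):
    """End-effector position of a planar two-link arm."""
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

def jacobian(q):
    """Analytic Jacobian d(x, y)/d(q0, q1)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def ik_step(q, target, gain=0.5):
    """One incremental update: dq = gain * pinv(J) * position error.
    A learned model would output dq directly from (q, error) instead."""
    dq = gain * np.linalg.pinv(jacobian(q)) @ (target - forward(q))
    return q + dq

q = np.array([0.3, 0.8])
target = np.array([0.35, 0.25])
for _ in range(30):  # small linearized steps toward the target
    q = ik_step(q, target)
print(q, forward(q))
```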
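For the learned pouring motion, the abstract pairs DMP with PI². The sketch below shows the shape of that combination on a one-dimensional trajectory: the DMP's basis-function weights parameterize the motion, and each PI² iteration perturbs the weights, rolls out, and recombines the noise with softmax weights favoring low-cost rollouts. All constants, the cost function, and the basis-function heuristic are illustrative assumptions rather than the thesis's tuning.

```python
import numpy as np

def dmp_rollout(w, x0=0.0, g=1.0, tau=1.0, dt=0.01,
                alpha=25.0, beta=6.25, alpha_s=4.0):
    """Euler-integrate a 1-D DMP whose forcing term is a weighted
    sum of Gaussian basis functions over the phase variable s."""
    n = len(w)
    centers = np.exp(-alpha_s * np.linspace(0, 1, n))  # basis centers in phase space
    widths = n ** 1.5 / centers / alpha_s              # common width heuristic
    x, v, s = x0, 0.0, 1.0
    traj = [x]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (s - centers) ** 2)
        f = s * (g - x0) * (psi @ w) / (psi.sum() + 1e-10)  # forcing term
        v += dt / tau * (alpha * (beta * (g - x) - v) + f)
        x += dt / tau * v
        s += dt / tau * (-alpha_s * s)                      # canonical system
        traj.append(x)
    return np.array(traj)

def pi2_update(w, cost_fn, n_rollouts=16, sigma=2.0, h=10.0):
    """One PI^2 iteration: reward-weighted averaging of noisy rollouts."""
    eps = sigma * np.random.randn(n_rollouts, len(w))
    costs = np.array([cost_fn(dmp_rollout(w + e)) for e in eps])
    z = np.exp(-h * (costs - costs.min()) / (costs.max() - costs.min() + 1e-10))
    return w + (z / z.sum()) @ eps  # low-cost rollouts dominate the update

def cost_fn(traj, g=1.0):
    """Illustrative cost: reach the goal without overshooting it."""
    return 10 * abs(traj[-1] - g) + np.maximum(traj - g, 0).sum()

w = np.zeros(10)
for _ in range(50):
    w = pi2_update(w, cost_fn)
print(cost_fn(dmp_rollout(w)))
```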