Author: | 簡江恆 Chien, Chiang-Heng |
---|---|
Thesis title: | 視覺型同時定位與建圖系統及其在FPGA上的實現 FPGA-Based Implementation for Visual Simultaneous Localization and Mapping System |
Advisors: | 許陳鑑 Hsu, Chen-Chien; 王偉彥 Wang, Wei-Yen |
Degree: | Master |
Department: | Department of Electrical Engineering |
Year of publication: | 2017 |
Academic year of graduation: | 105 |
Language: | Chinese |
Number of pages: | 147 |
Keywords (Chinese): | visual simultaneous localization and mapping, camera rotation and translation matrices, landmark update, loop closure, trajectory bending algorithm, One-Sided Hestenes-Jacobi algorithm, FPGA |
Keywords (English): | visual simultaneous localization and mapping, localization, map building, loop closure, trajectory bending, One-Sided Hestenes-Jacobi algorithm, FPGA |
DOI URL: | https://doi.org/10.6345/NTNU202202112 |
Thesis type: | Academic thesis |
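This thesis addresses the simultaneous localization and mapping problem for robots by proposing a visual simultaneous localization and mapping (V-SLAM) system based on linear models, together with an FPGA hardware acceleration circuit. The goal is a low-cost, low-power, and computationally efficient system that lets a robot build a three-dimensional map of an unknown environment in real time while estimating its own state within that map. The linear-model-based V-SLAM system uses the SIFT algorithm to detect feature points in each image and combines the feature information with a key-frame selection mechanism to avoid unnecessary computation, while landmark management filters out unreliable landmarks so that the relative camera state estimation algorithm can stably estimate the rotation and translation matrices relative to the previous time step. To build a complete 3D feature map, this thesis proposes a linear equation that updates landmark states at a quadratic rate of convergence, and the absolute camera state is then estimated through a linear localization equation. When the robot revisits a previously seen scene, the similarity between previous images and the current image is described by a linear model, and an outlier weighting function filters out outlier images so that loop closures are detected correctly; the robot can then correct every camera and landmark state through an improved trajectory bending algorithm, yielding more accurate localization and mapping results. In addition, exploiting the parallel processing of hardware acceleration circuits, the system is implemented on a low-end FPGA platform to provide the robot state and the environment map quickly; the One-Sided Hestenes-Jacobi algorithm is one of the modules designed in this thesis and is used to implement the singular value decomposition module. To verify the proposed V-SLAM system, software simulations, experiments with an RGB-D camera in a small-scale indoor environment, and experiments in a large-scale outdoor environment using stereo vision from the well-known KITTI dataset are conducted and compared with existing work. The results show that the linear-model-based V-SLAM system stably provides accurate localization results and that the landmark update algorithm does build a more complete 3D map; precision-recall curves further show that the proposed loop closure detection algorithm detects loops correctly. In the hardware experiments, feature-point data from a real environment are used to verify the hardware; the results show that, compared with a general-purpose computer, the FPGA speeds up localization and mapping by factors of about 350 and 460, respectively, indicating that the proposed V-SLAM system can perform simultaneous localization and mapping in real time on a low-end, low-cost, low-power platform.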
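As a rough illustration of how a precision-recall curve for loop closure detection can be computed, the following sketch sweeps a decision threshold over per-key-frame similarity scores. The function name, the score values, and the ground-truth labels are hypothetical; the linear similarity model and the outlier weighting function of the thesis are not reproduced here.

```python
def precision_recall_curve(scores, is_true_loop, thresholds):
    """Return (threshold, precision, recall) tuples for loop-closure candidates.

    scores       -- similarity score per candidate key-frame (hypothetical values)
    is_true_loop -- ground-truth flag per candidate (True if it really is a loop)
    thresholds   -- decision thresholds to sweep; a candidate is accepted
                    as a loop when its score is >= the threshold
    """
    total_true = sum(is_true_loop)
    curve = []
    for th in thresholds:
        predicted = [s >= th for s in scores]
        tp = sum(p and t for p, t in zip(predicted, is_true_loop))
        fp = sum(p and not t for p, t in zip(predicted, is_true_loop))
        # Convention assumed here: precision is 1.0 when nothing is accepted.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / total_true if total_true else 0.0
        curve.append((th, precision, recall))
    return curve


# Usage with made-up scores and labels.
curve = precision_recall_curve(
    scores=[0.9, 0.8, 0.4, 0.2],
    is_true_loop=[True, False, True, False],
    thresholds=[0.5, 0.3],
)
```

Lowering the threshold accepts more candidates, which typically raises recall at the cost of precision; plotting the resulting pairs gives the curve.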
This thesis addresses the visual simultaneous localization and mapping (V-SLAM) problem by proposing a V-SLAM system based on linear models. Moreover, to obtain a low-cost, low-power, and computationally efficient V-SLAM system, an FPGA implementation of the proposed approach is developed. The proposed V-SLAM system employs the SIFT feature detection and description algorithm to extract features from an image, which are subsequently used to decide whether the input image is a key-frame. Furthermore, map management is proposed to filter out unstable landmarks so that the relative camera pose can be estimated reliably. To build a consistent 3D map, landmarks are updated using an iterative linear equation that converges quadratically, and the updated landmarks are then used to estimate the absolute camera pose according to a linear model. To detect potential loop closures, another linear model is designed to describe the similarity between previously seen images and the current one so that looped key-frames can be found. If a loop is detected, an improved trajectory bending algorithm is employed to revise the states of the camera as well as the landmarks. To exploit the advantages of parallel computation, an FPGA implementation of the proposed V-SLAM system is developed, in which a One-Sided Hestenes-Jacobi module provides the singular value decomposition of a matrix. To verify the proposed system, extensive simulations and experiments are conducted in a small-scale indoor environment and a large-scale outdoor environment. The former uses an Xtion RGB-D camera, while the latter uses stereo vision from the public KITTI dataset. Compared with existing methods, the proposed approach provides stable and accurate localization according to the experimental results.
As for the hardware implementation, features from an indoor environment are used to verify the effectiveness of the system. Experimental results show that, compared with a general-purpose PC, the FPGA implementation runs approximately 350 and 460 times faster for localization and mapping, respectively.
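The One-Sided Hestenes-Jacobi method named above can be sketched in software as follows. This is a minimal floating-point illustration of the textbook algorithm, not the hardware module designed in the thesis; the function name, tolerance, and sweep limit are assumptions chosen for the example.

```python
import math


def one_sided_jacobi_svd(a, tol=1e-12, max_sweeps=60):
    """Return (u, sigma, v) such that a ~= u * diag(sigma) * v^T.

    a is a list of m rows with n entries each. Singular values are returned
    in column order, not sorted.
    """
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]  # working copy; its columns converge to U * diag(sigma)
    v = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = sum(u[k][i] * u[k][i] for k in range(m))
                beta = sum(u[k][j] * u[k][j] for k in range(m))
                gamma = sum(u[k][i] * u[k][j] for k in range(m))
                # Columns i and j are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Plane rotation that makes columns i and j orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same column rotation to the working matrix and to V.
                for mat in (u, v):
                    for row in mat:
                        x, y = row[i], row[j]
                        row[i] = c * x - s * y
                        row[j] = s * x + c * y
        if not rotated:
            break

    # Column norms are the singular values; normalizing the columns gives U.
    sigma = [math.sqrt(sum(u[k][j] ** 2 for k in range(m))) for j in range(n)]
    for row in u:
        for j in range(n):
            if sigma[j] > 0.0:
                row[j] /= sigma[j]
    return u, sigma, v


# Usage: the singular values of this matrix are sqrt(10) and sqrt(40).
u, sigma, v = one_sided_jacobi_svd([[4.0, 0.0], [3.0, -5.0]])
```

Each rotation touches only two columns, so disjoint column pairs can be processed independently; this property is commonly cited as what makes the method attractive for parallel hardware.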