Author: | 沈峻宇 Shen, Jun-Yu |
---|---|
Thesis title: | 基於YOLO深度學習用於小型漂浮物檢測的新型卷積演算法 A Novel Convolutional Algorithm Based on YOLO Deep Learning for Small Floating Object Detection |
Advisor: | 呂成凱 Lu, Cheng-Kai |
Oral defense committee: | 林書彥 Lin, Shu-Yen; 林承鴻 Lin, Cheng-Hung; 呂成凱 Lu, Cheng-Kai |
Oral defense date: | 2023/07/14 |
Degree: | 碩士 Master |
Department: | 電機工程學系 Department of Electrical Engineering |
Year of publication: | 2023 |
Academic year of graduation: | 111 |
Language: | Chinese |
Number of pages: | 72 |
Keywords (Chinese): | 物件檢測, 深度學習, YOLO, 空間金字塔池化 |
Keywords (English): | Object Detection, Deep Learning, YOLO, Spatial Pyramid Pooling |
Research methods: | Experimental design, comparative study |
DOI URL: | http://doi.org/10.6345/NTNU202301131 |
Thesis type: | Academic thesis |
Improperly discarded waste in the oceans has become a global crisis. To mitigate it, waste in oceans and rivers must be detected and removed before it exceeds the load the environment can bear. This research proposes a YOLOv4-based algorithm for detecting drifting waste in rivers. The algorithm adds an improved RegP pooling layer to the pooling layers of the spatial pyramid pooling module and reduces the number of detection layers at the output. These changes are intended to improve feature extraction, prevent the loss of important or fine details, and target the detection of very small objects. We evaluate the method on the FloW and Pascal VOC datasets. Compared with current state-of-the-art techniques, it achieves higher mAP, with improvements of 7.91% and 11.36%, respectively, on FloW, and it attains the best accuracy among several advanced methods for floating waste detection. Experiments on Pascal VOC confirm that the method is effective on objects of different sizes. Finally, experiments on WIDER FACE for detecting small faces also show some improvement in accuracy. This study offers a promising solution that can help detect and remove waste from rivers and so contribute to addressing the global marine debris crisis.
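As a rough illustration of the spatial pyramid pooling stage that the abstract builds on, the following is a minimal pure-Python sketch of the standard SPP block as commonly used in YOLOv4: stride-1 max pooling at several kernel sizes, with the results concatenated to the input along the channel axis. The kernel sizes (5, 9, 13) and the names `max_pool_same` and `spp_block` are assumptions made for this example. The sketch uses plain max pooling only and does not implement the improved RegP pooling layer proposed in the thesis.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one channel: rows of values


def max_pool_same(channel: Grid, k: int) -> Grid:
    """Stride-1 max pooling whose output has the same size as the input."""
    h, w = len(channel), len(channel[0])
    r = k // 2  # window radius; cells outside the grid are simply skipped
    out: Grid = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                channel[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(feature_map: List[Grid],
              kernels: Sequence[int] = (5, 9, 13)) -> List[Grid]:
    """Concatenate the input channels with their pooled versions.

    Hypothetical helper: kernel sizes are an assumption for this sketch.
    """
    out = list(feature_map)
    for k in kernels:
        out.extend(max_pool_same(ch, k) for ch in feature_map)
    return out


# Usage: a single 6x6 channel with one active cell in the top-left corner.
fm = [[[0.0] * 6 for _ in range(6)]]
fm[0][0][0] = 1.0
pooled = spp_block(fm)
print(len(pooled))  # one input channel plus three pooled channels
```

Because every pooled channel keeps the spatial size of the input, the larger kernels spread a strong activation over a wider neighbourhood without discarding the original, unpooled channel. This is the sense in which SPP combines context at several scales while keeping fine detail available to later layers.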