
Student: 曾柏恩 (Tseng, Po-An)
Thesis Title: 基於深度學習之變韌鐵電子顯微影像中MA島的輕量化分割模型 (A Lightweight Segmentation Model for MA Islands in Bainite Electron Micrographs Based on Deep Learning)
Advisor: 方瓊瑤 (Fang, Chiung-Yao)
Committee Members: 陳世旺 (Chen, Sei-Wang), 黃仲誼 (Huang, Chung-I), 羅安鈞 (Luo, An-Chun), 方瓊瑤 (Fang, Chiung-Yao)
Oral Defense Date: 2023/07/17
Degree: Master
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Publication Year: 2023
Academic Year of Graduation: 111
Language: English
Pages: 48
Keywords (Chinese): 麻田散鐵-沃斯田鐵島, 變韌鐵, 深度學習, 語義分割, 輕量級模型, 金相學, 通道注意力機制
Keywords (English): martensite-austenite islands, bainite, deep learning, semantic segmentation, lightweight model, metallography, channel attention mechanism
Research Methods: experimental design, comparative study, observational research
DOI: http://doi.org/10.6345/NTNU202301346
Document Type: Academic thesis
Usage: Views: 100, Downloads: 16

    This study presents a deep-learning-based system for segmenting martensite-austenite (MA) islands in bainite electron micrographs. MA islands play a crucial role in predicting the impact resistance of bainite, but their conventional assessment relies on subjective expert judgment, which limits accuracy and consistency. To overcome this, the proposed system incorporates the opinions of multiple experts during training to achieve more objective and stable segmentation results. In practical applications, however, processing large numbers of high-resolution micrographs demands significant computational resources, so this study also treats model lightweighting as an important research direction.
    The proposed system utilizes a backbone network to extract relevant features from bainite micrographs, followed by a head network for MA island segmentation. This study explores two backbone networks, High-Resolution Net (HRNet) and Lite-HRNet, and modifies both to reduce model complexity and increase efficiency; OCRNet serves as the head network in both cases. To obtain a lightweight HRNet, its fundamental ResNet building blocks are replaced with ConvNeXt blocks and certain pointwise-convolution layers are removed, after which efficient channel attention (ECA) modules are added to enhance performance. In Lite-HRNet, the spatial weight (SW) and cross-resolution weight (CRW) modules are replaced with ECA modules to reduce the number of parameters and the computational complexity.
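    The ECA module mentioned above can be sketched as follows. This is a minimal NumPy illustration of the published ECA-Net design (global average pooling, a 1D convolution across channels with an adaptively chosen kernel size, and a sigmoid gate); it is not code from the thesis, and the function names and zero-padding choice are illustrative assumptions.

```python
import numpy as np

def eca_kernel_size(channels, gamma=2, b=1):
    # Adaptive kernel size from ECA-Net: k = |(log2(C) + b) / gamma|, rounded to the nearest odd number
    t = int(abs((np.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1

def eca_forward(x, conv_weight):
    # x: a single feature map of shape (C, H, W); conv_weight: 1D kernel of odd length k
    c = x.shape[0]
    # 1) Squeeze: global average pooling over the spatial dimensions -> (C,)
    y = x.mean(axis=(1, 2))
    # 2) Local cross-channel interaction: 1D convolution along the channel axis (zero padding assumed)
    k = conv_weight.shape[0]
    y_pad = np.pad(y, k // 2, mode="constant")
    attn = np.array([np.dot(y_pad[i:i + k], conv_weight) for i in range(c)])
    # 3) Sigmoid gate, broadcast back over the spatial dimensions
    attn = 1.0 / (1.0 + np.exp(-attn))
    return x * attn[:, None, None]
```

    Because the attention weights come from a shared k-tap 1D kernel rather than a fully connected bottleneck (as in squeeze-and-excitation), the module adds only k parameters per stage, which is consistent with using it to replace heavier weighting modules.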
    The experimental results demonstrate that the combination of the improved HRNet and OCRNet reduces the number of parameters by 64% and the floating-point operations (FLOPs) by 27% compared with the original HRNet, while achieving an intersection over union (IoU) of 78.14% for MA islands. Similarly, the combination of the improved Lite-HRNet and OCRNet reduces the number of parameters by 4.6% and the FLOPs by 11.66% compared with the original Lite-HRNet, while maintaining an MA island IoU of 78.09%. These results show that the proposed modifications preserve segmentation accuracy while reducing computational complexity.
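    The IoU metric reported above can be computed for a binary MA-island mask as follows. This is a minimal NumPy sketch of the standard definition (intersection divided by union of the predicted and ground-truth foreground pixels); the function name and the empty-union convention are illustrative, not taken from the thesis.

```python
import numpy as np

def iou(pred, target):
    # pred, target: masks of the same shape where MA-island pixels are truthy
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # Convention assumed here: two empty masks count as a perfect match
    return inter / union if union > 0 else 1.0
```

    For example, a prediction overlapping the ground truth in one of three foreground pixels yields an IoU of 1/3; a reported IoU of 78.14% means intersection pixels are 78.14% of union pixels, averaged over the evaluation set.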

    Table of Contents
    1 Introduction
      1.1 Research Motivation
      1.2 Research Background and Difficulty
      1.3 Research Contribution
      1.4 Thesis Framework
    2 Related Work
      2.1 MA Islands
      2.2 Segmentation of MA Islands
      2.3 Semantic Segmentation Neural Networks
      2.4 Backbone Networks
        2.4.1 HRNet
        2.4.2 Lite-HRNet
        2.4.3 Vision Transformer
        2.4.4 BEiT
        2.4.5 ConvNeXt
      2.5 Head Networks
        2.5.1 Object-Contextual Representation
        2.5.2 MaskFormer
        2.5.3 Mask2Former
      2.6 Summary
    3 Research Method
      3.1 System Overview
      3.2 Lightweight Backbone Network
        3.2.1 Lightweight HRNet
        3.2.2 Lightweight Lite-HRNet
      3.3 Summary
    4 Experimental Results
      4.1 Research Environment and Equipment Setup
      4.2 Bainite Micrograph Dataset
      4.3 Macro-Level Architecture Design of the Backbone Network
      4.4 Lightweight Improvement Analysis
        4.4.1 HRNet Lightweight Improvement Analysis
        4.4.2 Lite-HRNet Lightweight Improvement Analysis
      4.5 Comparison of Lightweight Methods for HRNet and Lite-HRNet
    5 Conclusions and Future Works
      5.1 Conclusion
      5.2 Future Works
    References

