Graduate student: | 鄭翔升 Cheng, Hsiang-Sheng |
---|---|
Thesis title: | 以 RISC-V SoC 為基礎的類神經網路模型部署工具 Neural network model deployment tools for SoC based on RISC-V cores |
Advisor: | 黃文吉 Hwang, Wen-Jyi |
Oral defense committee: | 董一志 Tung, Yi-Chih; 葉佐任 Yeh, Tso-Zen; 黃文吉 Hwang, Wen-Jyi |
Oral defense date: | 2024/01/15 |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Publication year: | 2024 |
Academic year of graduation: | 112 |
Language: | Chinese |
Number of pages: | 51 |
Keywords (English): | RISC-V, TinyML, Model deployment |
DOI URL: | http://doi.org/10.6345/NTNU202400185 |
Thesis type: | Academic thesis |
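This thesis implements a model deployment tool for RISC-V SoCs that integrates model construction, model quantization, and model deployment into a single software toolchain. The tool uses a custom intermediate representation (IR) to automatically convert a neural network into C code that can run on the SoC. Its main purpose is to simplify the model deployment flow during the development stage of TinyML systems. The deployment results, including the neural network inference dataflow and performance, are verified on an SoC built from a Rocket Core and the Gemmini AI accelerator and implemented on a Genesys2 FPGA.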
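The idea of lowering a network through an intermediate representation into C can be illustrated with a minimal sketch. It is not the thesis's actual IR or code generator: the `DenseOp` node, the `emit_c` function, and the `dense_int8` runtime kernel named in the generated C are all hypothetical, chosen only to show how a list of quantized layer descriptors might be turned into a C inference function.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DenseOp:
    """Hypothetical IR node: one int8 fully connected layer."""
    name: str
    in_features: int
    out_features: int
    relu: bool = False


def emit_c(ops: List[DenseOp], model_name: str = "model") -> str:
    """Emit C source that runs the layers in order.

    The generated code calls dense_int8(), an assumed runtime kernel
    that is only declared here, not implemented.
    """
    if not ops:
        raise ValueError("IR is empty")
    # Check that consecutive layers have compatible shapes.
    for prev, nxt in zip(ops, ops[1:]):
        if prev.out_features != nxt.in_features:
            raise ValueError(f"shape mismatch between {prev.name} and {nxt.name}")

    lines = [
        "#include <stdint.h>",
        "",
        "/* Hypothetical runtime kernel, assumed to be provided elsewhere. */",
        "void dense_int8(const int8_t *in, const int8_t *w, const int32_t *b,",
        "                int8_t *out, int in_n, int out_n, int relu);",
        "",
    ]
    # Quantized weights and biases are assumed to live in another C file.
    for op in ops:
        lines.append(f"extern const int8_t {op.name}_w[{op.out_features * op.in_features}];")
        lines.append(f"extern const int32_t {op.name}_b[{op.out_features}];")
    lines.append("")
    lines.append(f"void {model_name}_infer(const int8_t *input, int8_t *output) {{")
    src = "input"
    for i, op in enumerate(ops):
        last = i == len(ops) - 1
        dst = "output" if last else f"buf{i}"
        if not last:
            # Intermediate activations get a static buffer each.
            lines.append(f"    static int8_t {dst}[{op.out_features}];")
        lines.append(
            f"    dense_int8({src}, {op.name}_w, {op.name}_b, {dst}, "
            f"{op.in_features}, {op.out_features}, {int(op.relu)});"
        )
        src = dst
    lines.append("}")
    return "\n".join(lines) + "\n"


# Usage: a two-layer network described in the hypothetical IR.
ir = [DenseOp("fc1", 64, 32, relu=True), DenseOp("fc2", 32, 10)]
c_source = emit_c(ir, "demo")
print(c_source)
```

In a real flow the kernel call would presumably be replaced by whatever the target SoC's accelerator library provides, and the IR would carry quantization parameters such as scales and zero points, which this sketch leaves out.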