Author: | 蔡佳諭 Tsai, Chia-Yu |
---|---|
Thesis title: | 基於RISC-V架構之脈動陣列一維卷積運算研究 Implementation of 1-D Convolution in Systolic Array based on RISC-V Architecture |
Advisor: | 黃文吉 Hwang, Wen-Jyi |
Oral defense committee: | 葉佐任 Yeh, Tso-Zen; 鮑興國 Pao, Hsing-Kuo; 黃文吉 Hwang, Wen-Jyi |
Oral defense date: | 2022/07/27 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2022 |
Academic year of graduation: | 110 |
Language: | Chinese |
Number of pages: | 69 |
Keywords (Chinese): | 深度學習加速器 (deep learning accelerator), 一維卷積運算 (1-D convolution) |
Keywords (English): | Gemmini, RISC-V, Systolic Array |
Research method: | Experimental design |
DOI URL: | http://doi.org/10.6345/NTNU202201366 |
Thesis type: | Academic thesis |
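Because of how they are positioned as products, most current edge devices lack the computing power that AI model applications require. Pairing a device with a hardware AI accelerator, so that it can run AI models, is therefore one way out of this difficulty.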
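This thesis studies Gemmini, a hardware AI accelerator platform built on the RISC-V architecture. Using the RISC-V custom instructions as a foundation, it designs a 1-D convolution program that performs its computation on the accelerator, so that the platform can be applied broadly to neural networks.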
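The program is run on an FPGA that includes the Gemmini platform, with clock cycles as the measure of computation speed. Two comparisons are made: model computation with and without the accelerator, and 1-D convolution performed by using Gemmini directly versus by rearranging the data first and then using Gemmini. Together, these comparisons verify Gemmini's acceleration and the feasibility of using it directly to compute 1-D CNNs.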
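The two ways of computing a 1-D convolution that the abstract compares can be illustrated in plain Python. This is a minimal sketch, not the thesis's program: it assumes the data rearrangement is an im2col-style unrolling (the abstract does not say which rearrangement is used), it assumes stride 1 with no padding, and it uses a software matrix multiplication as a stand-in for the operation a systolic array such as Gemmini accelerates. All function names are hypothetical and no Gemmini API is involved.

```python
def conv1d_direct(x, w):
    """Direct 1-D convolution (cross-correlation, as used in CNNs).

    x: input,   shape [C_in][L]
    w: kernels, shape [C_out][C_in][K]
    Returns output of shape [C_out][L - K + 1] (stride 1, no padding).
    """
    c_in, length = len(x), len(x[0])
    k = len(w[0][0])
    l_out = length - k + 1
    return [[sum(w[o][c][j] * x[c][t + j]
                 for c in range(c_in) for j in range(k))
             for t in range(l_out)]
            for o in range(len(w))]


def im2col_1d(x, k):
    """Unroll every length-k window of x into one row: shape [L_out][C_in*K]."""
    c_in, length = len(x), len(x[0])
    return [[x[c][t + j] for c in range(c_in) for j in range(k)]
            for t in range(length - k + 1)]


def matmul(a, b):
    """Plain matrix product; stands in for the accelerator's matrix multiply."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv1d_im2col(x, w):
    """Same result as conv1d_direct, expressed as one matrix multiplication."""
    c_in, k = len(x), len(w[0][0])
    cols = im2col_1d(x, k)                       # [L_out][C_in*K]
    # Flatten kernels in the same (channel, tap) order used by im2col_1d.
    w_mat = [[w[o][c][j] for o in range(len(w))]
             for c in range(c_in) for j in range(k)]  # [C_in*K][C_out]
    out_t = matmul(cols, w_mat)                  # [L_out][C_out]
    return [list(row) for row in zip(*out_t)]    # [C_out][L_out]


# Two input channels of length 4, two output channels, kernel size 2.
x = [[1, 2, 3, 4], [0, 1, 0, 1]]
w = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
assert conv1d_direct(x, w) == conv1d_im2col(x, w)
```

The rearranged form copies each input sample up to K times, so it trades extra memory traffic for a single dense matrix product; that trade-off is the kind of cost the clock-cycle comparison is meant to expose.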