Graduate student: | 鄭博升 Cheng, Po-Sheng |
---|---|
Thesis title: | 以矩陣乘法為基礎應用硬體加速器於一維卷積計算之研究 Matrix multiplication based 1-D convolution with hardware accelerator |
Advisor: | 黃文吉 Hwang, Wen-Jyi |
Oral defense committee: | 鮑興國 Pao, Hsing-Kuo; 葉佐任 Yeh, Tso-Zen; 黃文吉 Hwang, Wen-Jyi |
Oral defense date: | 2022/07/27 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2022 |
Academic year of graduation: | 110 |
Language: | Chinese |
Number of pages: | 51 |
Keywords (Chinese): | 矩陣乘法 (matrix multiplication), 卷積計算 (convolution), 類神經網路 (neural network), 硬體加速器 (hardware accelerator), 量化 (quantization) |
Keywords (English): | FPGA, Quantization, Systolic Array, Weight Stationary |
Research method: | Experimental design |
DOI URL: | http://doi.org/10.6345/NTNU202201331 |
Thesis type: | Academic thesis |
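As computing power has grown, artificial intelligence has been able to rely on large volumes of convolution computations to extract features from data, allowing computers to handle a wide range of complex tasks for us. Among studies on speeding up convolution, implementing it as matrix multiplication is a common approach.
This thesis targets one-dimensional convolution and proposes a matrix arrangement that allows 1-D convolution to be carried out as a matrix multiplication. It then uses a general-purpose hardware accelerator to substantially improve the performance of that matrix multiplication.
The proposed method is applied to a neural network model and deployed on an FPGA development board. Experiments confirm that it produces accurate results and speeds up the computation of the overall neural network model.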
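The general idea of recasting 1-D convolution as a matrix multiplication can be illustrated with a small sketch. The abstract does not describe the specific matrix arrangement proposed in the thesis, so the code below assumes the widely used im2col-style lowering instead: each sliding window of the input becomes one row of a matrix, and the kernels become the columns of a second matrix. The function names, the single input channel, and the stride of 1 are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic im2col-style lowering, NOT the matrix
# arrangement proposed in the thesis (the abstract does not specify it).
# Assumes one input channel, stride 1, no padding, and the neural-network
# convention in which "convolution" does not flip the kernel.

def conv1d_direct(x, kernels):
    """Reference result: slide each kernel over x and take dot products."""
    k = len(kernels[0])
    out_len = len(x) - k + 1
    return [[sum(w[j] * x[i + j] for j in range(k)) for w in kernels]
            for i in range(out_len)]

def im2col_1d(x, k):
    """Stack every length-k sliding window of x as one row of a matrix."""
    return [x[i:i + k] for i in range(len(x) - k + 1)]

def matmul(a, b):
    """Plain matrix product; this is the step a hardware accelerator would take over."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def conv1d_as_matmul(x, kernels):
    """Compute the same output as conv1d_direct with a single matrix product."""
    windows = im2col_1d(x, len(kernels[0]))          # shape: (out_len, k)
    weights = [list(col) for col in zip(*kernels)]   # shape: (k, num_kernels)
    return matmul(windows, weights)

x = [1, 2, 3, 4, 5]
kernels = [[1, 0, -1], [1, 1, 1]]
result = conv1d_as_matmul(x, kernels)
print(result)  # [[-2, 6], [-2, 9], [-2, 12]]
```

Each output row corresponds to one window position and each column to one kernel. The cost of this lowering is that overlapping input samples are duplicated across rows, which is the memory overhead that alternative arrangements typically try to reduce.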