Graduate student: | 劉耿丞 Liu, Keng-Cheng |
---|---|
Thesis title: | 基於卷積神經網路之即時人臉表情辨識 Real-Time Facial Expression Recognition Based on Convolution Neural Network |
Advisors: | 許陳鑑 Hsu, Chen-Chien; 王偉彥 Wang, Wei-Yen |
Degree: | Master |
Department: | Department of Electrical Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 (ROC calendar, 2018-2019) |
Language: | Chinese |
Number of pages: | 81 |
Keywords (Chinese): | 深度學習、卷積神經網路、人臉表情辨識、影像處理 |
Keywords (English): | deep learning, convolution neural network (CNN), facial expression recognition, image processing |
DOI URL: | http://doi.org/10.6345/NTNU201900765 |
Thesis type: | Academic thesis |
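This thesis presents a real-time facial expression recognition system based on the Convolution Neural Network (CNN) and uses the proposed stability-improvement methods to resolve the instability of real-time facial expression recognition. Recognition accuracy can be raised in many ways, for example by image preprocessing or by changing the recognition architecture, and all of them aim at better results in applications. The problem targeted here is that, under the influence of lighting and similar factors, the image features of frames captured continuously by the camera change at certain moments, which produces errors in facial expression recognition. Because the camera captures images at high speed, the time interval between one image and the next is small. This thesis therefore proposes a different method for each of two recognition systems, one built on an improved LeNet convolutional neural network and one on a Two Stream convolutional neural network architecture: the former uses a weighted-average method and the latter a statistical method. With the proposed methods, both the overall stability and the robustness of real-time facial expression recognition are improved.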
This thesis proposes a real-time facial expression recognition system based on convolutional neural networks (CNNs) and addresses the instability of real-time recognition for different CNN architectures trained on different databases. Recognition accuracy can be improved in many ways, such as image preprocessing or adjustment of the network architecture, and all of them aim at better recognition results in applications. One remaining problem is that, when a camera captures images continuously, lighting and other factors can change the image characteristics at certain moments, and these changes cause the facial expression to be recognized incorrectly. Because the camera captures frames at high speed, the time interval between consecutive frames is short. This thesis therefore proposes a separate method for each of two recognition systems, one built on an improved LeNet convolutional neural network and one on a Two Stream convolutional neural network architecture. The former uses a weighted-average method, and the latter uses a statistical method. With the proposed methods, the overall stability and robustness of real-time facial expression recognition are improved.
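The abstract names a weighted-average method and a statistical method but gives no formulas for either. As a rough sketch only, assuming the weighted average is taken over the per-frame class probabilities of the most recent frames and that the statistical method is a majority vote over the most recent predicted labels, the idea might look like the following. The class names, the window size of 5, and the linearly increasing weights are hypothetical and are not taken from the thesis.

```python
from collections import Counter, deque

import numpy as np


class WeightedAverageSmoother:
    """Assumed form of a weighted-average method: average the last
    `window` probability vectors, weighting newer frames more heavily."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)  # oldest frame is dropped automatically

    def update(self, probs):
        self.history.append(np.asarray(probs, dtype=float))
        # Hypothetical weights: oldest frame gets 1, newest gets len(history).
        weights = np.arange(1, len(self.history) + 1, dtype=float)
        averaged = np.average(np.stack(list(self.history)), axis=0, weights=weights)
        return int(np.argmax(averaged))


class MajorityVoteSmoother:
    """Assumed form of a statistical method: report the label predicted
    most often over the last `window` frames."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        self.history.append(int(label))
        # On a tie, most_common returns the label that entered the window first.
        return Counter(self.history).most_common(1)[0][0]
```

In use, each frame's CNN output would be passed to `update`, and the returned label would be displayed in place of the raw per-frame prediction. A single frame that is misclassified because of a lighting change is then outvoted or outweighed by its neighbors, at the cost of a short delay when the expression really does change.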