Author: | 吳家安 WU, Chia-An |
---|---|
Thesis title: | 類神經網路模型應用於食品熱量與營養成份分析 A Neural Network Model for Calorie and Nutrition Analysis based on Food Images |
Advisor: | 方瓊瑤 Fang, Chiung-Yao |
Degree: | Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Publication year: | 2019 |
Graduation academic year: | 107 |
Language: | Chinese |
Pages: | 82 |
Chinese keywords: | 食物影像辨識、食物營養分析、食物熱量分析、Mask R-CNN、彩色影像、影像分割 |
English keywords: | food image recognition, food nutrition analysis, food calorie analysis, Mask R-CNN, color image, image segmentation |
DOI URL: | http://doi.org/10.6345/NTNU201900395 |
Thesis type: | Academic thesis |
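People today pursue quality of life and pay close attention to their health. However, according to the World Health Organization's 2016 statistics on the top ten global causes of death, published in 2018, chronic diseases account for half of the top ten causes. Good eating habits can help prevent chronic disease and obesity. The typical way to understand one's eating habits is to record every meal and analyze its calories and nutrients. This study therefore proposes a system that analyzes food images to estimate calories and nutrients, so that users can quickly learn the calories and nutrition in each meal and thereby achieve a balanced diet.
Once started, the system reads in a food image, resizes it to a specific ratio, and feeds it to Mask R-CNN. Mask R-CNN first uses a ResNet101-FPN architecture to extract low-level to high-level food features, then feeds the features from every level into an RPN (region proposal network) to obtain the food regions in the image. RoIAlign fixes the size of each food region, which is then passed to the Mask R-CNN head to predict the food category, the food bounding box, and the food mask. The system then uses the food mask to obtain the area the food occupies in the image and feeds the number of pixels it occupies into a linear regression equation to estimate the food's weight. With the weight in hand, the system consults the food nutrition databases of the Ministry of Health and Welfare and the United States Department of Agriculture to report the estimated calories and nutrients.
This study recognizes 16 food categories: salad, fruit, toast, egg, sausage, chicken, bacon, French toast, omelette, hash browns, pancake, ham, hamburger steak, sandwich, French fries, and hamburger. Combining the Ville Cafe Dataset and the Food-256 Dataset yields 36013 images and 58013 food items. Of these, 1278 images with 6096 food items serve as the training set, and 686 images with 3680 food items serve as the test set. On the combined Ville Cafe Dataset and Food-256 Dataset, the food recognition accuracy is 99.86% and the IoU is 97.17%.
The weight estimation experiment covers the non-composite foods: salad, fruit, toast, sausage, bacon, ham, hamburger steak, and French fries. These categories use 40, 40, 44, 40, 41, 49, 40, and 40 samples respectively, 320 samples in total, for the linear regression. In the experimental results, the mean absolute error is 8.22 and the mean relative error is 0.13.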
Over the past few decades, obesity has become a serious problem in modern life. Obesity is associated with many chronic diseases that are leading causes of death, including diabetes, heart disease, stroke, and cancer. The most effective way to prevent obesity is to control food intake, that is, to know the nutrients and calories in what one eats. To help users understand the food intake of each meal, this thesis develops a food recognition system that analyzes the food composition of a provided image. This thesis also proposes a newly collected dataset, the Ville Cafe Dataset, for food recognition.
The system is built on a Mask R-CNN network with a postprocessing mechanism. Mask R-CNN is composed of a Mask R-CNN backbone, a RoIAlign layer, and a Mask R-CNN head. The backbone first applies a ResNet101-FPN structure to extract features at different levels. These features are then fed to an RPN (region proposal network) to locate food regions, or Regions of Interest (RoIs), in the image. The RoIAlign layer resizes the RoIs using bilinear interpolation and feeds them to the Mask R-CNN head. The head then classifies the food category and predicts the food bounding boxes and food masks. After obtaining the region and category of each food item, the system estimates the food's weight using a linear regression model. This thesis also proposes a postprocessing mechanism that modifies the extracted bounding boxes and masks to give better results for both analysis and visualization.
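The weight estimation step can be illustrated with a minimal sketch: count the pixels in a food mask, then apply a line fitted by ordinary least squares. The function names, the calibration pairs, and the choice of one regression line per food category are all assumptions made for illustration; the abstract does not give the fitted coefficients or the exact regression setup.

```python
from typing import Sequence, Tuple


def mask_area(mask: Sequence[Sequence[int]]) -> int:
    """Count the pixels marked as food (non-zero) in a binary mask."""
    return sum(1 for row in mask for value in row if value)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_weight(mask: Sequence[Sequence[int]], slope: float, intercept: float) -> float:
    """Map the mask's pixel count to an estimated weight with the fitted line."""
    return slope * mask_area(mask) + intercept


# Hypothetical calibration pairs for one food category: (pixel count, weight).
# These numbers are made up and do not come from the thesis.
pixel_counts = [1000.0, 2000.0, 3000.0, 4000.0]
weights = [25.0, 45.0, 65.0, 85.0]
slope, intercept = fit_line(pixel_counts, weights)

# A toy 3x4 binary mask standing in for a Mask R-CNN output.
toy_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
estimated = estimate_weight(toy_mask, slope, intercept)
print(f"area={mask_area(toy_mask)} px, estimated weight={estimated:.2f}")
```

In practice the pixel count depends on camera distance and image scale, which is presumably why the system resizes each image to a fixed ratio before segmentation.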
To estimate calories and nutrients accurately, the system uses the food nutrition databases provided by the Ministry of Health and Welfare and the United States Department of Agriculture (USDA). Using this information, the system reports the estimated calories and nutrients based on the computed food weight and the recognition results.
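A rough sketch of the lookup step follows. It assumes the database values are expressed per 100 g of food and that the estimated weight is in grams, so the reported amounts are a simple proportional scaling. The table contents, field names, and the `nutrition_for` helper are hypothetical placeholders, not entries from either database.

```python
from typing import Dict

# Hypothetical per-100 g entries for illustration only; these are not
# values taken from the Ministry of Health and Welfare or USDA databases.
NUTRITION_PER_100G: Dict[str, Dict[str, float]] = {
    "toast": {
        "calories_kcal": 300.0,
        "protein_g": 10.0,
        "fat_g": 5.0,
        "carbohydrate_g": 50.0,
    },
}


def nutrition_for(category: str, weight_g: float,
                  table: Dict[str, Dict[str, float]] = NUTRITION_PER_100G) -> Dict[str, float]:
    """Scale a category's per-100 g values to the estimated weight in grams."""
    per_100g = table[category]
    factor = weight_g / 100.0
    return {name: value * factor for name, value in per_100g.items()}


# Example: a slice of toast estimated at 40 g.
result = nutrition_for("toast", 40.0)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```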
To evaluate the system, the experiments use two datasets, the Food-256 dataset and the Ville Cafe Dataset. The Ville Cafe Dataset contains 16 categories with 35842 images for each category. The model is first trained on the training set, a mixture of Food-256 and Ville Cafe, to recognize 16 categories of food: salad, fruit, toast, egg, sausage, chicken cutlet, bacon, french toast, omelette, hash browns, pancake, ham, hamburger steak, sandwich, french fries and hamburger. The training set contains 1278 food images with 6096 food items. For testing, 686 food images with 3680 food items are used. The food recognition accuracy on the mixture of the Ville Cafe Dataset and the Food-256 Dataset is 99.86%, and the IoU is 97.17%. The food weight estimation experiment covers eight categories: salad, fruit, toast, sausage, bacon, ham, hamburger steak and french fries. The categories use 40, 40, 44, 40, 41, 49, 40 and 40 data points respectively, 320 in total, to fit the linear regression model. In the experimental results, the mean absolute error is 8.22, and the mean relative error is 0.13.
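For readers unfamiliar with the reported metrics, the sketch below shows their standard definitions: intersection over union (IoU) between a predicted and a ground-truth binary mask, and the mean absolute and mean relative errors of a set of weight estimates. The abstract does not say whether IoU is computed on masks or on bounding boxes, nor how the errors are normalized, so these definitions are assumptions, and all sample values are invented.

```python
from typing import Sequence


def mask_iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 0.0


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all samples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| / actual over all samples."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Toy masks: 2 overlapping pixels out of 6 covered in total.
pred_mask = [[1, 1, 0], [1, 1, 0]]
true_mask = [[0, 1, 1], [0, 1, 1]]
iou = mask_iou(pred_mask, true_mask)

# Invented weights, not measurements from the thesis.
actual_weights = [50.0, 100.0, 200.0]
predicted_weights = [45.0, 110.0, 190.0]
mae = mean_absolute_error(actual_weights, predicted_weights)
mre = mean_relative_error(actual_weights, predicted_weights)

print(f"IoU={iou:.3f}, MAE={mae:.3f}, MRE={mre:.3f}")
```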