| Student | 蔡曉瑩 (Hsiao-Ying Tsai) |
|---|---|
| Thesis Title | 學習導向黑板教學影片結構化之研究 (Learning-focused Structuring for Blackboard Lecture Videos) |
| Advisor | 李忠謀 (Lee, Chung-Mou) |
| Degree | Master |
| Department | Department of Computer Science and Information Engineering |
| Year of Publication | 2010 |
| Graduating Academic Year | 98 (ROC calendar, 2009-2010) |
| Language | Chinese |
| Pages | 51 |
| Chinese Keywords | 語意分析、教學影片分析、視覺注意力模型 |
| English Keywords | semantic analysis, lecture video analysis, visual attention modeling |
| Document Type | Academic thesis |
Blackboards remain common in classrooms today, and lecture videos recorded from blackboard teaching are correspondingly widespread. Content-based analysis of blackboard lecture videos, however, is challenging and has rarely been addressed in the field of multimedia semantic analysis. This thesis proposes an attention-model-based method for structuring blackboard lecture videos by estimating the learning focus, that is, how much attention learners should pay to each part of the video. Visual and aural attention models are designed separately and their outputs are fused into a learning-focused model: the visual analysis examines the fluctuation of the written content on the blackboard and the lecturer's posture, while the aural analysis examines the lecturer's speech.
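The abstract does not give the visual model's formulation, so the following is only a minimal sketch of how the fluctuation of written content could be turned into a per-frame attention value. It assumes light chalk strokes on a dark board, a fixed camera, and pre-cropped board-region frames; `ink_mask`, the brightness threshold, and the smoothing window are illustrative choices, not the author's method, and posture analysis is omitted entirely.

```python
import cv2
import numpy as np

def ink_mask(gray_frame, thresh=180):
    """Rough chalk-stroke mask: bright strokes on a dark board (assumption)."""
    _, mask = cv2.threshold(gray_frame, thresh, 255, cv2.THRESH_BINARY)
    return mask > 0

def visual_attention_curve(frames, window=5):
    """Visual attention from the fluctuation of written content on the board.

    frames: grayscale board-region frames sampled at a fixed rate.
    Attention is taken to be high while ink is being added or erased
    (the ink-pixel count is changing) and low while the board is static.
    """
    ink_counts = np.array([ink_mask(f).sum() for f in frames], dtype=float)
    # Frame-to-frame change in the amount of ink ~ writing/erasing activity.
    fluctuation = np.abs(np.diff(ink_counts, prepend=ink_counts[0]))
    # Moving-average smoothing so momentary flicker does not dominate.
    smoothed = np.convolve(fluctuation, np.ones(window) / window, mode="same")
    # Normalize to [0, 1] so curves from different cues are comparable.
    peak = smoothed.max()
    return smoothed / peak if peak > 0 else smoothed
```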
Fusing the multiple attention models yields a learning-focused attention curve. Its values reflect how much attention the lecturer expects students to devote to each part of the video, and hence the semantic strength of the corresponding lecture content, so they can be used to indicate the importance of the extracted content at each point in time. Learners can therefore browse a structured blackboard lecture video flexibly, locating the segments and lecture content they should understand.
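The fusion rule is likewise unspecified in the abstract; a weighted sum of normalized curves followed by thresholding is one common choice. The sketch below, with hypothetical cue names, weights, and threshold, shows how a fused learning-focused curve could index the video into segments worth attention.

```python
import numpy as np

def fuse_attention(curves, weights=None):
    """Weighted linear fusion of normalized attention curves.

    curves: dict of equally sampled curves in [0, 1],
            e.g. {"visual": v_curve, "aural": a_curve}.
    weights: optional per-cue weights; uniform if omitted.
    """
    names = sorted(curves)
    if weights is None:
        weights = {n: 1.0 / len(names) for n in names}
    return sum(weights[n] * np.asarray(curves[n], dtype=float) for n in names)

def important_segments(curve, fps, thresh=0.5):
    """Contiguous runs where fused attention >= thresh, as (start_s, end_s)."""
    above = np.asarray(curve) >= thresh
    segments, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        segments.append((start / fps, len(above) / fps))
    return segments

# Example: fuse two cues sampled once per second, then index the video.
# curve = fuse_attention({"visual": v, "aural": a},
#                        weights={"visual": 0.6, "aural": 0.4})
# print(important_segments(curve, fps=1.0))
```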
Experimental results show that the proposed method can effectively structure blackboard lecture videos and extract lecture content with associated learning-focused attention values.