Graduate student: | 陳映如 |
---|---|
Thesis title: | 輔助傳統教學影片視頻分割與索引之研究 Video Scene Segmentation and Indexing for Traditional Instructional Videos |
Advisor: | 李忠謀 |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Year of publication: | 2013 |
Academic year of graduation: | 101 |
Language: | Chinese |
Number of pages: | 46 |
Chinese keywords: | 教學影片, 內容分析, 片段變化偵測 |
English keywords: | lecture videos, content analysis, shot change detection |
Thesis type: | Academic thesis |
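With today's well-developed recording tools, teachers can record instructional videos themselves and upload them to their own teaching platforms, or to those provided by their schools, for students to watch. However, the video of a single class is quite long. When reviewing, students may fail to understand only certain segments, yet they must download the complete video, which wastes time; video segmentation is therefore all the more important. In addition, in the traditional teaching model, students must listen while copying down what the teacher writes on the blackboard. When the teacher writes or speaks too fast, students find it hard to concentrate on the lecture and are prone to copying errors. If the lecture is recorded and the course content written on the blackboard is automatically extracted from each video segment, students can more easily grasp the structure of the lecture and are less likely to take incorrect notes that impair their understanding of the course.
This thesis proposes an intelligent assistance system for instructional videos. With the cameras fixed in place, two cameras film the lecture, and the images captured by both are merged so that students can view clear and complete teaching content. The K-means method is used to remove information other than the blackboard, such as the teacher, students, and the podium; the remaining blackboard region of each frame is then updated so that the teacher's body does not hide the blackboard content. Local binarization is used to extract the handwriting, and a noise removal method is designed. Suitable segmentation points in the video are detected from the amount of change in the handwriting; once the segments have been detected, the frame with the most complete handwriting is captured as the lecture notes, which helps students search for video segments. Experiments on four sets of instructional videos recorded in different environments show good results for both segmentation point detection and note extraction.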
With today's advanced video recording equipment, teachers can record instructional videos and upload them to e-learning platforms for students to watch. However, a student may fail to understand only a few parts of a lecture and still have to spend time downloading the entire video. This inconvenience shows the importance of video scene segmentation. In addition, in the traditional teaching model, students must listen and copy the content on the blackboard at the same time. When the lecturer talks or writes too fast, students may find it difficult to focus on the lecture or to copy the content correctly. If the lecture is recorded and the content on the blackboard is extracted automatically, students can not only locate the relevant video segments easily but also make fewer transcription mistakes.
This study presents an intelligent assistance system for lecture videos. Two cameras were used to film instructional videos, and the images from both cameras were merged so that students can view clear and complete teaching content. K-means segmentation was adopted to remove information other than the content on the blackboard, such as the teacher, students, and the podium. The blackboard image was updated frame by frame so that the teacher's body does not block the content. Adaptive thresholding was applied to extract the writing on the blackboard, and a noise reduction method was designed. By detecting changes in the amount of writing on the blackboard, suitable points for video segmentation were located and the instructional videos were divided into segments. Finally, the frame with the most complete teaching content in each segment was taken as the lecture notes to help students search for video clips. Performance evaluation on lecture videos recorded in four different environments shows that the approach is highly effective in detecting shot change points and achieves very low content missing rates in the lecture notes.
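The segmentation step summarized above can be illustrated with a minimal sketch. It assumes that each frame has already been reduced to a single number, the count of writing pixels left after thresholding, and that a sharp drop in that count marks an erasure and therefore a cut point. The function name `segment_by_writing_amount`, the `drop_ratio` parameter, and the erasure rule itself are hypothetical choices made for illustration; the abstract does not specify the thesis's actual decision rule.

```python
from typing import List, Tuple


def segment_by_writing_amount(
    ink_counts: List[int],
    drop_ratio: float = 0.5,
) -> List[Tuple[int, int, int]]:
    """Split a frame sequence wherever the amount of writing drops sharply.

    ink_counts[i] is the number of writing pixels found in frame i.
    Returns (start, end, key_frame) triples with inclusive frame indices;
    key_frame is the frame with the most writing inside the segment.
    """
    if not ink_counts:
        return []

    segments = []
    start = 0
    peak = 0  # index of the fullest frame seen in the current segment
    for i in range(1, len(ink_counts)):
        # Assumed rule (not from the thesis): a count below drop_ratio
        # times the segment's peak is treated as an erasure, so the
        # current segment ends at the previous frame.
        if ink_counts[i] < drop_ratio * ink_counts[peak]:
            segments.append((start, i - 1, peak))
            start = i
            peak = i
        elif ink_counts[i] > ink_counts[peak]:
            peak = i
    segments.append((start, len(ink_counts) - 1, peak))
    return segments


# Made-up pixel counts: writing accumulates, the board is erased, then refilled.
counts = [0, 120, 300, 480, 500, 90, 60, 210, 400, 410]
print(segment_by_writing_amount(counts))
```

With these made-up counts the sketch reports two segments, frames 0 to 4 and frames 5 to 9, and picks the fullest frame of each (4 and 9) as the candidate lecture-note image. A real system would likely smooth the counts first, since a teacher walking in front of the board can also cause a temporary drop.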