
Author: 盧建廷 (Chien-Ting Lu)
Thesis title: 自動化演講錄製系統 (Automated Lecture Recording System)
Advisor: 陳世旺 (Chen, Sei-Wang)
Degree: Master
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Year of publication: 2011
Graduation academic year: 99 (ROC calendar, i.e., academic year 2010–2011)
Language: Chinese
Number of pages: 85
Keywords (Chinese): 自動化演講錄製系統、專業攝影師、Adaboost人臉檢測器、Mean shift演算法、審美標準
Keywords (English): Automated lecture recording system, professional photographer, Adaboost face detector, mean shift algorithm, esthetic criteria
Thesis type: Academic thesis
Abstract (Chinese): In this thesis, an automated lecture recording system is developed with the goal of recording an entire lecture in the manner of a director operating the camera. The system consists of two PTZ cameras: one tracks the speaker, and the other captures the projection screen. First, an Adaboost classifier locates the speaker's face, and the face region is then tracked with the mean-shift algorithm. If the speaker uses teaching aids such as a laser pointer, a baton, or hand gestures, their positions can be detected and marked. At the same time, the camera covering the projection screen localizes the screen region for capture using an intensity threshold. Based on the relative positions of the speaker and the screen, the system automatically adjusts the size, orientation, and position of the speaker's face so that the shot satisfies esthetic criteria predefined by professional photographers.
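The abstract describes a detect-then-track pipeline: an Adaboost (Haar cascade) face detector initializes the speaker's position, and a mean-shift tracker follows the face afterward. The sketch below is only an illustration of that idea using OpenCV; the library choice, the file name, and the hue-histogram target model are assumptions for illustration (the thesis additionally uses edge-orientation-histogram features), not the thesis's actual implementation.

```python
# Minimal detect-then-track sketch (assumed OpenCV-based illustration, not the thesis code).
import cv2

cap = cv2.VideoCapture("lecture.avi")  # hypothetical lecture video
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

ok, frame = cap.read()
if not ok:
    raise SystemExit("could not read the first frame")

# Adaboost (Haar cascade) detection initializes the speaker's face window.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
x, y, w, h = faces[0]  # assume at least one face (the speaker) is found

# Build a hue histogram of the face region as the mean-shift target model.
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

track_window = (x, y, w, h)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    # Mean shift moves the window toward the mode of the back-projection.
    _, track_window = cv2.meanShift(back_proj, track_window, term_crit)
    x, y, w, h = track_window
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("speaker", frame)
    if cv2.waitKey(30) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```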

Abstract (English): In this thesis, an automatic lecture recording system that is expected to act as a professional photographer is developed. The system consists of two PTZ cameras: one tracks the speaker and the other localizes the projection screen. The system first locates the speaker's face in the image provided by the first camera using an Adaboost detector, and then tracks the detected face over the video sequence with a mean-shift algorithm. If the speaker uses teaching aids such as laser pointers, batons, or hand gestures, these are detected and marked as well. Meanwhile, the second camera localizes the screen with a simple intensity threshold and tracks it over the video sequence according to the camera's movements. Based on the information about the speaker and the screen, the system automatically adjusts the size, orientation, and position of the speaker in the image to satisfy esthetic criteria predefined by professional photographers.
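The abstract also states that the second camera localizes the projection screen by a simple intensity threshold. A minimal sketch of that step is given below, again using OpenCV as an assumed stand-in; the threshold value, the file name, and the largest-bright-region heuristic are illustrative assumptions rather than the thesis's actual parameters.

```python
# Sketch of screen localization by intensity thresholding (assumed parameters, OpenCV 4.x).
import cv2
import numpy as np

frame = cv2.imread("classroom.jpg")  # hypothetical frame from the screen camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# The projected screen is usually the brightest large region in the room;
# the threshold value 200 is an assumption, not the thesis's setting.
_, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((15, 15), np.uint8))

# Take the largest bright connected region as the screen candidate.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
screen = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(screen)
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite("screen_localized.jpg", frame)
```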

Table of Contents:
List of Tables
List of Figures
Chapter 1  Introduction
  1.1 Research Background
  1.2 Literature Review
  1.3 Thesis Organization
Chapter 2  System Architecture
  2.1 System Objectives
  2.2 System Architecture
  2.3 Camera Considerations
  2.4 System Flow
Chapter 3  Speaker Detection
  3.1 Speaker Detection
  3.2 Attention Cascade
  3.3 Rectangular Features
  3.4 Haar Features
  3.5 Integral Image
  3.6 Regional Integral Image Computation
  3.7 Adaboost-Based Classifier Selection
  3.8 Detection Results and Analysis
Chapter 4  Speaker Tracking
  4.1 The Speaker Tracking Problem
  4.2 Color
  4.3 Edge
  4.4 Basic Concepts of the Mean-Shift Algorithm
  4.5 Kernel Function Construction
  4.6 Color Distribution Density Function of the Template Image
  4.7 EOH Feature Function of the Template Image
  4.8 Bhattacharyya Coefficient
  4.9 Mean-Shift Tracking Algorithm
  4.10 Implementation Flow of the Mean-Shift Algorithm
  4.11 Tracking Results and Analysis
Chapter 5  Screen Localization and Teaching-Aid Detection
  5.1 Projection Screen Detection
  5.2 Projection Screen Tracking
  5.3 Slide-Change Detection
  5.4 Teaching-Aid Detection
Chapter 6  Camera Action
  6.1 Camera Coverage
  6.2 Speaker Placement
  6.3 Camera Event Control Table
  6.4 Event Decision Tree
  6.5 Experimental Results
Chapter 7  Conclusions
  7.1 Conclusions
  7.2 Future Work
References

