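Author: 李韋廷
Thesis title: Human Posture Recognition System Applied to Slide Control (人體姿勢判斷系統應用於投影片控制)
Advisor: 李忠謀
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: Chinese
Pages: 47
Keywords (Chinese): 人體姿勢辨識, 體感操控, SVM分類器
Keywords (English): arm gesture recognition, somatosensory control, SVM classifier
Thesis type: Academic thesis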
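  • Posture recognition is an increasingly important topic in computer vision, with growing use in surveillance, security and care monitoring, analysis of athletes' posture, and automatic video post-production. In recent years it has also become a key element of somatosensory (motion-based) control: a person simply stands in front of a camera to manipulate objects on the screen, as if using a very large touch screen. Teachers who present with slides cannot stand in front of the projection screen to point out key points at the right moment, because they are confined to the lectern to operate the computer. We therefore propose applying posture-recognition control to the operation used most often in teaching, switching slide pages.
    A common approach in posture recognition research is the finite state machine (FSM), whose most representative form is the Hidden Markov Model (HMM). FSM-based recognition, however, must always know which state the observed object is in, which easily leads to accumulated errors. This study instead first defines the command gestures and the operations they map to, then uses supervised learning to learn the features of users performing the different command gestures, and finally recognizes gestures with a Support Vector Machine (SVM) classifier. To achieve real-time recognition, it avoids the dilation and erosion algorithms commonly used for noise removal, which add computational load, and uses a Grid Motion Detection method that suppresses noise while still detecting object motion, giving the classifier better recognition results.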

    Gesture recognition has recently become an important and interesting research issue in computer vision. Typical applications include intelligent surveillance systems, security activity analysis, precise analysis of athletic performance, and automatic virtual directors. Moreover, somatosensory control is a new idea based on gesture recognition techniques. People can control objects on the screen without any controller, as if using a huge touch screen. Because lecturers who present with slides cannot stand by the projection screen and are confined to the computer desk, we propose a gesture recognition system for presentation control.

    Most traditional gesture recognition methods use the Hidden Markov Model (HMM), which is based on the finite state machine and performs well only when the object is well observed. To remove this restriction, we present a supervised learning method using a Support Vector Machine (SVM) in this thesis. The SVM classifier is trained on features learned from users. Moreover, instead of using dilation and erosion algorithms to reduce noise in the input image, we propose a Grid Motion Detection method that improves system performance and also reduces the effect of noise.
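The abstract does not spell out how Grid Motion Detection works, so the following is only a minimal sketch of one plausible reading: difference two consecutive grayscale frames, divide the result into square cells, and mark a cell as moving only when enough of its pixels changed. Isolated noisy pixels then fail the per-cell vote without any dilation or erosion pass. The function name `grid_motion` and the parameters `cell`, `pixel_thresh`, and `cell_ratio` are hypothetical and are not taken from the thesis.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def grid_motion(prev: Frame, curr: Frame, cell: int = 4,
                pixel_thresh: int = 25,
                cell_ratio: float = 0.5) -> List[List[bool]]:
    """Return a grid of flags, True where a cell contains motion.

    Assumed rule (not from the thesis): a pixel counts as changed when
    its absolute frame difference exceeds pixel_thresh, and a cell is
    flagged when at least cell_ratio of its pixels changed.
    """
    height, width = len(curr), len(curr[0])
    rows, cols = height // cell, width // cell
    grid = []
    for gr in range(rows):
        row_flags = []
        for gc in range(cols):
            changed = 0
            for y in range(gr * cell, (gr + 1) * cell):
                for x in range(gc * cell, (gc + 1) * cell):
                    if abs(curr[y][x] - prev[y][x]) > pixel_thresh:
                        changed += 1
            row_flags.append(changed >= cell_ratio * cell * cell)
        grid.append(row_flags)
    return grid


# Toy example: an 8x8 frame with a moving 4x4 block and one noisy pixel.
prev_frame = [[0] * 8 for _ in range(8)]
curr_frame = [[0] * 8 for _ in range(8)]
for y in range(4):
    for x in range(4):
        curr_frame[y][x] = 200   # real motion fills the top-left cell
curr_frame[6][6] = 255           # single-pixel noise in the bottom-right cell

print(grid_motion(prev_frame, curr_frame))
```

In this toy input only the top-left cell is flagged; the lone noisy pixel does not reach the per-cell vote. Larger cells lower the cost per frame but coarsen localization, which is consistent with the grid-size trade-off the thesis evaluates in Chapter 6.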

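    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Research Objectives
      1.3 Scope and Limitations
      1.4 Thesis Organization
    Chapter 2 Literature Review
      2.1 Research on Human Detection
        2.1.1 Motion Segmentation
        2.1.2 Object Classification
      2.2 Human Pose Representation and Estimation
      2.3 Human Pose and Behavior Analysis
    Chapter 3 System Architecture and Workflow
      3.1 System Architecture
      3.2 System Workflow
    Chapter 4 Human Pose Extraction
      4.1 Skin Color Detection
      4.2 Connected Component Analysis
      4.3 K-means Clustering
      4.4 Trajectory Movement
      4.5 Grid Motion Detection
    Chapter 5 Pose Recognition
      5.1 Start Gesture Detection
        5.1.1 Neighboring Bounding Boxes
        5.1.2 Distribution of K-means Skin-Color Feature Points
        5.1.3 Amount of Motion
        5.1.4 Trajectory Path Analysis
      5.2 Feature Extraction
      5.3 Support Vector Machine
    Chapter 6 Experimental Results and Discussion
      6.1 Experimental Video Database
      6.2 Experimental Method and Evaluation
      6.3 Experimental Results
    Chapter 7 Conclusion and Future Work
      7.1 Conclusion
      7.2 Future Work
    References

    List of Figures
      Figure 2.1 Background subtraction.
      Figure 2.2 Consecutive frame differencing, filtered with a threshold.
      Figure 2.3 3-D model of human pose.
      Figure 2.4 Representation of the human body as 11 joints, proposed by Hassan Foroosh et al. [27]
      Figure 2.5 Observing the three marked joints is sufficient for pose recognition.
      Figure 2.6 HMM is a recognition model based on the FSM (finite state machine)
      Figure 2.7 Illustration of HMM, CRF, and HCRF
      Figure 3.1 System flowchart
      Figure 4.1 Skin-color filtering using RGB
      Figure 4.2 Skin-color image after connected component analysis
      Figure 4.3 K-means initial points
      Figure 4.4 Matching K-means skin-color feature points with connected components
      Figure 4.5 Trajectory movement
      Figure 4.6 Illustration of morphological operations
      Figure 4.7 Grid Motion Detection
      Figure 5.1 Neighboring bounding boxes
      Figure 5.2 Distribution of K-means skin-color feature points
      Figure 5.3 Grid Motion Detection
      Figure 5.4 SVM separating data into two classes
      Figure 5.5 Illustration of SVM classification
      Figure 6.1 System interface
      Figure 6.2 Recognition rates of the polynomial and RBF kernel functions
      Figure 6.3 An overly large grid size makes start-gesture detection harder
      Figure 6.4 Relationship between feature dimension and gesture collection time
      Figure 6.5 Hierarchical and non-hierarchical recognition

    List of Tables
      Table 6.1 Recognition rates of the system's SVM for different data contents
      Table 6.2 Execution time and start-gesture detection rate for different grid sizes
      Table 6.3 Start-gesture detection rate for different grid sizes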

    [1] J. Alon, V. Athitsos, Q. Yuan, and S. Sclaroff, “A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 9, pp. 1685-1699, Sept. 2009.
    [2] Y. Bogomolov, G. Dror, S. Lapchev, E. Rivlin, M. Rudzsky, and I. Tel-Aviv, “Classification of moving targets based on motion and appearance,” British Machine Vision Conference, Norwich, United Kingdom, pp. 429–438, Oct. 2003.
    [3] G. Bradski and J. Davis, “Motion segmentation and pose recognition with motion history gradients,” International Journal of Machine Vision and Application, vol. 13, no. 3, pp. 174–184, July 2002.
    [4] R. Cutler and L. Davis, “Robust real-time periodic motion detection, analysis, and applications,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 781–796, Aug. 2000.
    [5] A. Efros, A. Berg, G. Mori, and J. Malik, “Recognizing action at a distance,” IEEE International Conference on Computer Vision, Nice, France, pp. 726–733, Oct. 2003.
    [6] A. Fathi and G. Mori, “Action recognition by learning mid-level motion features,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference, Anchorage, Alaska, USA, pp. 1–8, June 2008.
    [7] I. Haritaoglu, D. Harwood, and L. Davis, “W4: Real-time surveillance of people and their activities,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 809–830, Aug. 2000.
    [8] W. Hu, T. Tan, L. Wang, and S. Maybank, “A survey on visual surveillance of object motion and behaviors,” Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions, vol. 34, no. 3, pp. 334–352, Aug. 2004.
    [9] O. Javed and M. Shah, “Tracking and object classification for automated surveillance,” European Conference on Computer Vision, Copenhagen, Denmark, pp. 343–357, May 2002.
    [10] C. Joslin, A. El-Sawah, Q. Chen, and N. Georganas, “Dynamic Gesture Recognition,” Instrumentation and Measurement Technology Conference, Proceedings of the IEEE, Ottawa, Ontario, Canada, pp. 1706-1711, May 2005.
    [11] H. Kaiqi, T. Dacheng, Y. Yuan, L. Xuelong, and T. Tieniu, “View-Independent Behavior Analysis,” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions, vol. 39, no. 4, pp. 1028-1035, Aug. 2009.
    [12] J. Lafferty, A. McCallum, and F. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” International Conference on Machine Learning, Williams College, Williamstown, MA, USA, pp. 282-289, June 2001.
    [13] A. Lipton, H. Fujiyoshi, and R. Patil, “Moving target classification and tracking from real-time video,” IEEE Workshop on Application of Computer Vision, Princeton, New Jersey, pp. 8–14, Oct. 1998.
    [14] B. Lo, and S. Velastin, “Automatic congestion detection system for underground platforms,” International Symposium Intelligent Multimedia, Video and Speech, Hong Kong, China, pp. 158-161, May 2001.
    [15] S. Mitra and T. Acharya, “Gesture Recognition: A Survey,” Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 37, no. 3, pp. 311-324, May 2007.
    [16] B.-W. Min, H.-S. Yoon, J. Soh, Y.-M. Yang, and T. Ejima, “Hand gesture recognition using hidden Markov models,” Systems, Man, and Cybernetics, 'Computational Cybernetics and Simulation'., IEEE International Conference on, Florida, USA, pp. 4232-4235, Oct. 1997.
    [17] A. Mohan, C. Papageorgiou, and T. Poggio, “Example-based object detection in images by components,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 4, pp. 349–361, Apr. 2001.
    [18] A. Quattoni, S. Wang, L.-P. Morency, T. Darrell, and M. Collins, “Hidden Conditional Random Fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 10, pp. 1848-1852, Oct. 2007.
    [19] A. Quattoni, S. Wang, L.-P. Morency, T. Darrell, and D. Demirdjian, “Hidden Conditional Random Fields for Gesture Recognition,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference, New York, USA, pp. 1521-1527, June 2006.
    [20] E. Rivlin, M. Rudzsky, R. Goldenberg, U. Bogomolov, and S. Lepchev, “A real-time system for classification of moving objects,” International Conference Pattern Recognition, Quebec, Canada, pp. 688–691, Aug. 2002.
    [21] N. Robertson and I. Reid, “Behavior understanding in video: A combined method,” IEEE Conference on Computer Vision, Beijing, China, pp. 808–815, Oct. 2005.
    [22] M. Rodriguez and M. Shah, “Detecting and segmenting humans in crowded scenes,” IEEE International Conference on Multimedia, Beijing, China, pp. 353–356, July 2007.
    [23] V. Vezhnevets, V. Sazonov, and A. Andreeva, “A Survey on Pixel-Based Skin Color Detection Techniques,” Proceedings of the Graphic Conference, Moscow, Russia, pp. 85-92, Sep. 2003.
    [24] P. Viola, M. Jones, and D. Snow, “Detecting pedestrians using patterns of motion and appearance,” International Conference on Computer Vision, Nice, France, pp. 734–741, Oct. 2003.
    [25] J. Xiaofei, and L. Honghai, “Advances in View-Invariant Human Motion Analysis: A Review,” Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions, vol. 40, no. 1, pp. 13-24, Jan. 2010.
    [26] Q. Yuan, S. Sclaroff, and V. Athitsos, “Automatic 2D Hand Tracking in Video Sequences,” Application of Computer Vision, Breckenridge, CO, USA, pp. 250-256, Jan. 2005.
    [27] S. Yuping, and H. Foroosh, “View-Invariant Action Recognition from Point Triplets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 10, pp. 1898-1905, Oct. 2009.
    [28] Y. Zhang, K. Huang, Y. Huang and T. Tan, “View-Invariant Action Recognition Using Cross Ratios Across Frames,” 16th IEEE International Conference on Image Processing, Cairo, Egypt, pp. 3549-3552, Nov. 2009.
    [29] Q. Zhou and J. Aggarwal, “Tracking and classifying moving objects from video,” IEEE Workshop Performance Evaluation of Tracking and Surveillance, Kauai, Hawaii, USA, pp. 1–8, Dec. 2001.