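Graduate student: | 陳勉光 Chen mian guang |
---|---|
Thesis title: | 利用可攜式鏡頭輔助視障者即時辨識公車車號 Helping the Blind to Identify City Bus Numbers with the Portable Digital Camera |
Advisors: | 葉榮木 Yeh, Zong-Mu; 蔡俊明 Tsai, Chun-Ming |
Degree: | Master |
Department: | Department of Mechatronic Engineering |
Year of publication: | 2010 |
Academic year of graduation: | 98 |
Language: | Chinese |
Number of pages: | 78 |
Keywords (Chinese): | 區域分割 (region segmentation), 相鄰相減 (adjacent-frame differencing), 前景擷取 (foreground extraction), 字元辨識 (character recognition), 語音播放系統 (speech playback system) |
Keywords (English): | Segmentation, Frame difference, Foreground Extraction, OCR, MS SAPI 5.1 |
Thesis type: | Academic thesis |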
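Visually impaired people face many difficulties when taking a bus, and the most critical is being unable to identify the bus number. Current workarounds are to ask passers-by for help or to hold up a self-made sign showing the route number to catch the bus driver's attention, but both are passive and depend on many uncontrolled factors. Given the growing maturity of digital image processing and the falling cost of camera hardware, this study uses digital image processing, with a digital camera standing in for a portable lens, to help visually impaired people identify bus numbers in real time and to deliver the result through other senses. The goals are active search and recognition and higher processing speed, with the captured bus number output in real time by speech or vibration. The experiments use an ordinary consumer camera rather than the fixed camera assumed by earlier approaches: a digital camera simulates a portable lens, and the bus region is segmented from images taken at non-fixed positions and angles. A staged processing pipeline improves system speed. First, differencing of adjacent frames quickly extracts the foreground bus image, and geometric analysis of the bus determines where the bus number is located. Next, Sobel edge detection combined with a morphological mask localizes the number, and the cropped number image is segmented into characters and recognized. Finally, the OCR system is paired with MS SAPI 5.1 to output the result as speech, so that the number is recognized and announced before the bus stops. The test footage covers the bus from about 70 meters before the stop until it pulls in, and each stopping sequence lasts about 5 seconds. Of 100 consecutive test frames, about 70 had the bus region correctly framed, of which 30 had the bus number position correctly located and recognized. The system processes 31 frames per second, which is sufficient for real-time use. In the future the system could run on multiple platforms to provide a convenient, portable assistive tool for visually impaired people.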
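The adjacent-frame differencing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which the record does not include: the function names `frame_difference` and `bounding_box`, the default threshold of 30, and the use of plain nested lists as grayscale frames are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale frame: rows of 0-255 intensities


def frame_difference(prev: Frame, curr: Frame, threshold: int = 30) -> Frame:
    """Binary mask: 1 where two consecutive frames differ by more than threshold.

    The threshold value is an assumed example, not a figure from the thesis.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def bounding_box(mask: Frame) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing all foreground pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


if __name__ == "__main__":
    # Synthetic 4x5 frames: a bright 2x2 block appears in the second frame.
    previous = [[10] * 5 for _ in range(4)]
    current = [row[:] for row in previous]
    for r in (1, 2):
        for c in (2, 3):
            current[r][c] = 200
    print(bounding_box(frame_difference(previous, current)))
```

The bounding box of the changed pixels gives a candidate foreground region that later stages could inspect. With a handheld camera, differences caused by camera motion would also appear in the mask, so a real system would need additional filtering.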
Visually impaired persons may encounter many difficulties when taking a bus, and recognizing the bus number can be the most challenging. At present, the ways to solve this problem are to ask other passengers for help or to hold up a self-made board showing the bus number to attract the bus driver's attention. However, both methods are passive and unreliable. This research applies digital image processing technology, using the camera of current 3C products such as mobile phones and PDAs, to help visually impaired persons recognize the bus number through senses other than sight. The study aims to deliver timely bus information through proactive (automatic) identification, fast response without loss of accuracy, and non-visual outputs such as vibration and sound. In the experiment, the algorithms remove the need for a fixed lens and can segment the bus image from unfixed positions and angles, and the proposed method speeds up the system. First, the system extracts the bus image by frame difference and identifies the position of the bus number through geometric analysis. Then it uses a Sobel mask and a location algorithm to segment the bus number and recognizes the characters using Optical Character Recognition (OCR). Finally, the system speaks the recognized bus number through Microsoft Speech Application Programming Interface 5.1 (MS SAPI 5.1) before the bus stops. In the experiment, the video was filmed starting about 70 meters from the bus stop. Each clip was around 5 seconds long. Among 100 frames, the bus image was segmented correctly in about 70, and over 30 bus numbers were located correctly. The system processes 31 images per second. In the future, this technology can be applied to multiple platforms to realize a more convenient and helpful tool for visually impaired persons.
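The Sobel step named in the abstract can be sketched as follows, together with a simple morphological dilation that merges nearby edge pixels into connected regions. This is an illustrative sketch only: the function names `sobel_edges` and `dilate`, the threshold of 200, the |gx| + |gy| magnitude approximation, and the 3x3 structuring element are assumptions, not details taken from the thesis.

```python
from typing import List

Image = List[List[int]]  # grayscale image or binary mask as rows of ints

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(img: Image, threshold: int = 200) -> Image:
    """Binary edge mask from the Sobel gradient; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel
                    gy += SOBEL_Y[j][i] * pixel
            # |gx| + |gy| is a common cheap approximation of gradient magnitude.
            if abs(gx) + abs(gy) > threshold:
                out[y][x] = 1
    return out


def dilate(mask: Image) -> Image:
    """3x3 binary dilation: a pixel is set if any neighbour (or itself) is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx]:
                        out[y][x] = 1
    return out


if __name__ == "__main__":
    # Synthetic 5x6 image with a vertical step edge between columns 2 and 3.
    image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
    for row in dilate(sobel_edges(image)):
        print(row)
```

Character strokes produce dense clusters of strong edges, so after dilation a number display tends to form a compact connected region whose bounding box can be cropped for character segmentation and OCR.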