Graduate student: | 鍾易昌 Yi-Chang Chung |
---|---|
Thesis title: | 基於Augmented XY-cut之文件影像結構分析 Augmented XY-cut Based Document Layout Structure Analysis |
Advisor: | 李忠謀 |
Degree: | Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Year of publication: | 2011 |
Academic year of graduation: | 99 |
Language: | Chinese |
Number of pages: | 44 |
Keywords (Chinese): | 文件影像結構分析 |
Keywords (English): | Document Image Layout Structure Analysis |
Thesis type: | Academic thesis |
This study analyzes the layout structure of document images by exploiting the layout relationships shared by many types of documents, combined with an editable rule base, to assist image analysis. Analyzing a document image generally requires a structural analysis of the image first, and a good structural analysis result simplifies the subsequent understanding steps. This study takes Recursive XY-cut as its basis and modifies it into a more practical Augmented XY-Cut analysis. Augmented XY-Cut corrects the shortcoming that Recursive XY-cut can only cut a document down to its fields, and it adds whitespace block nodes, so that the result conforms more closely to the document structure and reduces the complexity of the rule base.
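The abstract does not give the details of Augmented XY-Cut, so the following is only a minimal sketch of the general idea it describes: a recursive XY-cut over projection profiles that also keeps the whitespace gaps as nodes in the resulting tree. It is not the thesis's implementation. The names `Node`, `runs`, `xy_cut`, and the `min_gap` parameter are hypothetical, and the input is assumed to be a binarized image given as a list of rows with 1 for ink and 0 for background.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive
Box = Tuple[int, int, int, int]


@dataclass
class Node:
    kind: str  # "block" for content, "space" for a whitespace gap
    box: Box
    children: List["Node"] = field(default_factory=list)


def runs(profile: List[int]) -> List[Tuple[bool, int, int]]:
    """Split a projection profile into alternating (is_ink, start, end) runs."""
    out, start = [], 0
    for i in range(1, len(profile) + 1):
        if i == len(profile) or (profile[i] > 0) != (profile[start] > 0):
            out.append((profile[start] > 0, start, i))
            start = i
    return out


def xy_cut(img, box: Box, min_gap: int = 2,
           horizontal: bool = True, tried_other: bool = False) -> Node:
    """Recursively cut `box`, alternating directions (hypothetical sketch).

    horizontal=True projects onto rows, so cuts separate stacked regions;
    horizontal=False projects onto columns, so cuts separate side-by-side regions.
    """
    top, left, bottom, right = box
    if horizontal:
        profile = [sum(img[r][left:right]) for r in range(top, bottom)]
    else:
        profile = [sum(img[r][c] for r in range(top, bottom))
                   for c in range(left, right)]

    # Gaps narrower than min_gap are not separators: fold them into the ink.
    segs: List[Tuple[bool, int, int]] = []
    for is_ink, s, e in runs(profile):
        if not is_ink and e - s < min_gap:
            is_ink = True
        if segs and segs[-1][0] == is_ink:
            segs[-1] = (is_ink, segs[-1][1], e)
        else:
            segs.append((is_ink, s, e))

    if len(segs) <= 1:
        if segs and not segs[0][0]:
            return Node("space", box)      # the whole box is blank
        if not tried_other:                # no cut here: try the other axis
            return xy_cut(img, box, min_gap, not horizontal, True)
        return Node("block", box)          # no cut in either direction: leaf

    node = Node("block", box)
    for is_ink, s, e in segs:
        sub = (top + s, left, top + e, right) if horizontal \
            else (top, left + s, bottom, left + e)
        if is_ink:
            node.children.append(xy_cut(img, sub, min_gap, not horizontal))
        else:
            node.children.append(Node("space", sub))  # keep the gap as a node
    return node
```

Calling `xy_cut(img, (0, 0, height, width))` returns a tree in which content blocks and the gaps between them are siblings, so a rule written against the tree can refer to the spacing around a block directly. This is one plausible reading of how whitespace nodes might simplify a rule base, stated here as an assumption.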