Volume 14, Issue 2 (9-2017)                   JSDP 2017, 14(2): 141-158


Associate Professor, Shahrood University
Abstract:

Document images produced by scanners or digital cameras usually suffer from photometric and geometric distortions. If either kind of distortion is present, recognizing words from the document image using OCR is prone to errors. In this paper we propose a novel approach that significantly reduces geometric distortion in document images. The method first extracts the text lines from the document using morphological operators. Each extracted text line is then divided into a number of equal-size column strips.
This division allows each segment of a text line to be treated as straight rather than curved. Each extracted line segment is aligned horizontally. For this purpose, the segment is rotated through different angles, and the horizontal projection is computed for each rotation. The rotation angle whose projection signal has the highest peak is selected to align the line segment horizontally. To estimate the geometric distortion, a reference point is extracted from each segment of a text line. These points indicate the position of the text line at the starting column of each segment. A polynomial function is then fitted to the reference points of each text line. Finally, the geometric distortion in each part of the document is eliminated using a perspective transformation.
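The alignment and curve-fitting steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `estimate_strip_angle` and `fit_polynomial`, the candidate angle range, and the polynomial degree are hypothetical. It also makes one assumption to stay dependency-free: instead of resampling the rotated image, it rotates the foreground pixel coordinates and histograms the resulting row indices, which plays the role of the horizontal projection.

```python
import math
from collections import Counter


def estimate_strip_angle(pixels, angles_deg):
    """Return the candidate angle whose horizontal projection peaks highest.

    pixels: iterable of (x, y) foreground coordinates of one line segment.
    angles_deg: candidate rotation angles in degrees (an assumed search range).
    """
    pixels = list(pixels)
    best_angle, best_peak = None, -1
    for angle in angles_deg:
        t = math.radians(angle)
        s, c = math.sin(t), math.cos(t)
        # Row index of each pixel after rotating the segment by `angle`;
        # counting pixels per row gives the horizontal projection.
        rows = Counter(round(x * s + y * c) for x, y in pixels)
        peak = max(rows.values())
        if peak > best_peak:
            best_angle, best_peak = angle, peak
    return best_angle


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; coefficients are lowest order first."""
    n = degree + 1
    # Normal equations A c = b, A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= f * a[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


# A synthetic one-pixel-thick segment tilted by 5 degrees.
segment = [(x, round(x * math.tan(math.radians(5)))) for x in range(60)]
angle = estimate_strip_angle(segment, range(-10, 11))

# Hypothetical reference points: strip index versus text-line row position.
curve = fit_polynomial([0, 1, 2, 3, 4], [2.0, 2.4, 2.6, 2.6, 2.4], degree=2)
```

For the synthetic segment, the selected angle is the one that undoes the tilt. The normal-equation solver is adequate for a handful of reference points with small coordinates; for large column values, rescaling the x coordinates first would be a sensible precaution.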
The perspective transformation is estimated from the fitted polynomial function. To make the proposed method more stable for short text lines, the curve of adjacent, longer text lines is used. A post-processing stage is required after applying the perspective transformation to document patches, because the transformation is a continuous mapping but is applied to digital images, which distorts the result. To remove this distortion, the consistency of each pixel value with the values of its neighboring pixels is checked, and the values of inconsistent pixels are corrected.
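The abstract does not specify the consistency rule used in post-processing. As one possible reading, the sketch below flips a binary pixel when too few of its 8-connected neighbors share its value. The function name `fix_inconsistent_pixels`, the `min_support` parameter, and the binary list-of-rows image format are all assumptions made for illustration.

```python
def fix_inconsistent_pixels(image, min_support=1):
    """Flip binary pixels that fewer than `min_support` neighbors agree with.

    image: list of rows, each a list of 0/1 values (assumed binary input).
    Returns a new image; the input is left unchanged.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            # Count 8-connected neighbors that share this pixel's value.
            agreeing = sum(
                1
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
                if (rr, cc) != (r, c) and image[rr][cc] == image[r][c]
            )
            if agreeing < min_support:
                result[r][c] = 1 - image[r][c]
    return result


# A filled patch with a single-pixel hole, as resampling might leave behind.
patch = [[1] * 5 for _ in range(5)]
patch[2][2] = 0
cleaned = fix_inconsistent_pixels(patch)
```

With the default `min_support=1`, only fully isolated pixels are changed, so one-pixel-wide strokes are preserved. A larger value removes more noise at the cost of eroding stroke ends.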
The proposed method was evaluated on Persian and English databases and compared with existing methods. The results indicate the efficiency and accuracy of the proposed method in eliminating geometric distortions.

Type of Study: Research | Subject: Paper
Received: 2015/08/22 | Accepted: 2017/03/05 | Published: 2017/10/21 | ePublished: 2017/10/21
