This paper presents a two-step method for offline handwritten Farsi word recognition. In the first step, to improve recognition accuracy and speed, an algorithm is proposed that initially eliminates lexicon entries unlikely to match the input image. For this lexicon reduction, the words of the lexicon are clustered using the ISOCLUS and hierarchical clustering algorithms. Clustering is based on features that describe the general shape of a word. In the second step, a new method is proposed for extracting a histogram of the gradient image, which captures well the correspondence between different samples of the same handwritten word. The gradient feature vectors of input words are compared with the gradient feature vectors of the candidate words using a k-nearest-neighbor classifier. Recognition results on handwritten words from the IRANSHAR dataset show that the lexicon reduction step and the new gradient feature extraction method increase recognition accuracy and speed by reducing classifier confusion.
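The two-step pipeline summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: ISOCLUS and the paper's specific shape and gradient features are not reproduced here. Instead, the sketch assumes that cluster centroids are already available and uses a nearest-centroid lookup for lexicon reduction, a plain magnitude-weighted histogram of gradient directions as the feature, and a Euclidean k-nearest-neighbor vote restricted to the candidate words. All function and parameter names (`gradient_histogram`, `reduce_lexicon`, `knn_recognize`, `n_bins`, `k`) are hypothetical.

```python
import numpy as np


def gradient_histogram(image, n_bins=8):
    """Magnitude-weighted histogram of gradient directions, L1-normalised.

    Assumption: a simple stand-in for the paper's gradient feature.
    """
    gy, gx = np.gradient(image.astype(float))   # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)                  # direction in [-pi, pi]
    # Map each direction to one of n_bins equal angular sectors.
    bins = ((angle + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist


def reduce_lexicon(shape_feature, centroids, cluster_members):
    """Step 1 (assumed form): keep only the words of the nearest cluster."""
    distances = np.linalg.norm(centroids - shape_feature, axis=1)
    return cluster_members[int(np.argmin(distances))]


def knn_recognize(query_vec, train_vecs, train_labels, candidates, k=3):
    """Step 2 (assumed form): k-NN vote among samples of candidate words only."""
    keep = [i for i, label in enumerate(train_labels) if label in candidates]
    if not keep:
        return None
    dists = np.linalg.norm(train_vecs[keep] - query_vec, axis=1)
    votes = {}
    for j in np.argsort(dists)[:k]:
        label = train_labels[keep[j]]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

Restricting the vote to the reduced candidate set is what lets the first step affect both speed (fewer distance computations) and accuracy (fewer confusable classes) in this sketch.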
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.