Volume 17, Issue 4 (12-1399), pages 155-168


BabaAli B, Rekabdar B. Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model. JSDP 2021; 17 (4) :155-168
URL: http://jsdp.rcisp.ac.ir/article-1-975-fa.html


School of Mathematics, Statistics and Computer Science, University of Tehran
Abstract:
Handwriting modeling and recognition closely resembles speech modeling and recognition. Approaches developed for speech recognition can therefore be applied to handwriting recognition with small changes to the early stages, such as feature extraction. The spread of hybrid HMM-DNN approaches and of sequence-level objective functions such as MMI has brought considerable progress in speech recognition. Using the open-source KALDI toolkit, which is best known in speech recognition, together with the latest hybrid models available in it and a data augmentation method, this paper presents a model for Arabic handwriting recognition. The study was carried out on the KHATT dataset and reduced the word error rate on this dataset by 32/7 percent absolute.
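The word error rate mentioned in the abstract is conventionally computed as the word-level edit distance between the recognizer output and the reference transcription, divided by the number of reference words. The sketch below shows that computation under this common definition. The function name `word_error_rate` and the sample strings are illustrative assumptions and are not taken from the paper or from KALDI. An "absolute" reduction is the difference between two such rates, expressed in percentage points.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5.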
Type of study: Applied | Subject: Text processing papers
Received: 1397/11/28 | Accepted: 1398/6/11 | Published: 1399/12/4 | Published online: 1399/12/4 (Solar Hijri calendar)


Redistribution
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.

All rights to this website belong to the scientific-research quarterly Signal and Data Processing (JSDP).