Allophone-based acoustic modeling for Persian phoneme recognition

Ahmadi, Tahereh; Karshenas, Hossein; Babaali, Bagher; Alinejad, Batool

doi:10.29252/jsdp.17.3.37

Signal and Data Processing Journal: a scientific journal officially licensed by the Commission for Scientific Publications of the Ministry of Science, Research and Technology (MSRT). Publisher: Research Center for Development of Advanced Technologies

Volume 17, Issue 3 (9-1399), pp. 37-54

Ahmadi T, Karshenas H, Babaali B, Alinejad B. Allophone-based acoustic modeling for Persian phoneme recognition. JSDP 2020; 17(3): 37-54
URL: http://jsdp.rcisp.ac.ir/article-1-903-fa.html

Tahereh Ahmadi, Hossein Karshenas, Bagher Babaali, Batool Alinejad*

Faculty of Foreign Languages, University of Isfahan

Abstract:

Phoneme recognition is one of the fundamental stages of automatic speech recognition, and coarticulation is a serious obstacle to it. One way to compensate for the effect of coarticulation is to use context-dependent models in phoneme recognition. This study uses a linguistic approach to model allophones. To this end, the rules governing the occurrence of allophones in Persian were first extracted, determining which allophones each phoneme has. Modeling and identifying allophones requires an allophonic corpus; to build one, the Small FARSDAT corpus was used and labeled at the allophone level. This corpus was then used to model, and subsequently identify, the different allophones in the input speech. Finally, each identified allophone was assigned to its corresponding phoneme class, so that phoneme recognition was carried out by way of allophones. With this method, phoneme recognition accuracy in Persian showed a considerable improvement over the best results reported so far.
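The final step described in the abstract, assigning each recognized allophone to its phoneme class, can be illustrated with a minimal Python sketch. The allophone labels, the lookup table, and the function name below are hypothetical and chosen only for illustration; they are not the paper's allophone inventory or implementation.

```python
# Hypothetical allophone-to-phoneme table. The labels are illustrative
# assumptions, not the inventory extracted in the paper.
ALLOPHONE_TO_PHONEME = {
    "k_front": "k",   # assumed allophone of /k/ in one context
    "k_back": "k",    # assumed allophone of /k/ in another context
    "n_plain": "n",
    "n_velar": "n",
    "a": "a",         # a phoneme with a single allophone maps to itself
}


def allophones_to_phonemes(allophones):
    """Replace every recognized allophone label with its phoneme class."""
    phonemes = []
    for label in allophones:
        if label not in ALLOPHONE_TO_PHONEME:
            # Fail loudly on labels missing from the table.
            raise ValueError(f"unknown allophone label: {label}")
        phonemes.append(ALLOPHONE_TO_PHONEME[label])
    return phonemes


if __name__ == "__main__":
    recognized = ["k_front", "a", "n_velar"]
    print(allophones_to_phonemes(recognized))
```

Because the mapping is many-to-one, confusions between two allophones of the same phoneme do not count as phoneme errors once the labels are collapsed.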

Keywords: automatic speech recognition, automatic phoneme recognition, context-dependent models, phoneme, allophone, coarticulation

Full text [PDF 5608 kb]

Type of study: Applied | Subject: Speech processing
Received: 1397/7/6 | Accepted: 1398/3/1 | Published: 1399/9/15 | Published online: 1399/9/15 (dates in the Solar Hijri calendar)


Reuse
	This article may be reused under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.