Signal and Data Processing

Language: Persian (fa)
Title: Binaural Microscopic Model Based on Modulation Filterbank for the Prediction of Speech Intelligibility in Normal-Hearing Listeners
Section: Speech Processing Papers
Article type: Research paper
Pages: 135-151
URL: http://jsdp.rcisp.ac.ir/browse.php?a_code=A-10-813-1&slc_lang=fa&sid=1

Abstract: In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filterbank is introduced. So far, binaural models have determined binaural intelligibility using spectral criteria such as the STI and SII, or other analytical methods. The proposed model, unlike all existing models of binaural intelligibility prediction, uses an automatic speech recognizer (ASR) in the back-end as the decision unit. One advantage of this approach is that it allows the recognition rate of small units of speech, such as phonemes and syllables, to be analyzed. Another advantage is that the model uses pre-processing stages whose existence in the human auditory system has been verified. With the proposed feature matrix in the speech recognizer, the model yields good predictions in the presence of a single source of stationary speech-shaped noise. A comparison of the model's results with those of listening tests shows high correlations and low mean absolute errors. The confusion matrices for consonants also show a high correlation between predictions and measurements. The speech reception threshold predicted by the proposed model has a smaller mean absolute error (0.6 dB) than that of the baseline model, BSIM.

Keywords: Prediction of Speech Intelligibility, Binaural Models, Modulation Filterbank, Microscopic Models, Macroscopic Models

Authors:
Ali Fallah (علی فلاح), ali.fallah@tabrizu.ac.ir, University of Tabriz (دانشگاه تبریز), author ID 10031947532846005232, corresponding author: no
Masoud Geravanchizadeh (مسعود گراوانچی زاده), geravanchizadeh@tabrizu.ac.ir, University of Tabriz (دانشگاه تبریز), author ID 10031947532846005233, corresponding author: yes