This study investigates the recognition rates of consonants in vowel-consonant-vowel (VCV) syllables, both in hearing tests and in two microscopic models. This syllable structure does not occur in Farsi or Azerbaijani; however, because the task is only to recognize the middle phoneme, the hearing tests show that listeners recognize the phonemes correctly in clean speech conditions. Because these syllables are meaningless, they suit our purpose, which is to determine the recognition rates of phonemes rather than of meaningful words. Using this corpus removes the influence of listeners’ linguistic knowledge in predicting words. The results of the hearing tests are compared with two microscopic models based on the human auditory system. The two models differ in the final stage of feature extraction: the first model uses an 8 Hz filter, whereas the second uses a modulation filterbank. Correct recognition rates of phonemes are compared across different signal-to-noise ratios and for two distance metrics in the speech recognizer. The listeners in this study are native speakers of Azerbaijani. Besides its empirical aspect, the novelty of this work lies in evaluating two different distance measures for Holube’s model and in directly comparing the two microscopic models in their prediction of the overall recognition rate and of the recognition rate of each consonant.
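
The two quantities compared throughout, the overall recognition rate and the recognition rate of each consonant, can be illustrated with a minimal sketch. It assumes each trial is recorded as a pair of labels (the consonant presented and the consonant reported by a listener or a model). The function name `recognition_rates` and the example trials are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def recognition_rates(presented, responded):
    """Return (overall_rate, per_consonant_rates) from paired trial labels.

    presented: consonant in the middle of each VCV token that was played.
    responded: consonant reported for the same token (listener or model).
    """
    if len(presented) != len(responded):
        raise ValueError("presented and responded must have the same length")
    if not presented:
        raise ValueError("at least one trial is required")

    # Number of trials per presented consonant.
    totals = Counter(presented)
    # Number of correct responses per presented consonant.
    correct = Counter(p for p, r in zip(presented, responded) if p == r)

    per_consonant = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / len(presented)
    return overall, per_consonant


# Hypothetical trials for illustration only (not measured data).
presented = ["b", "b", "d", "d", "g", "g", "g", "g"]
responded = ["b", "d", "d", "d", "g", "g", "b", "g"]

overall, per_consonant = recognition_rates(presented, responded)
print(overall)        # 0.75
print(per_consonant)  # {'b': 0.5, 'd': 1.0, 'g': 0.75}
```

Running the same function once per signal-to-noise ratio, and once per distance metric for the model responses, would yield the rates that are compared between listeners and models.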
