Speech recognition has improved greatly in recent years. However, robustness remains a major problem: recognition performance fluctuates sharply from speaker to speaker, especially when the speaker has a strong accent, and accent variation dramatically decreases the accuracy of an ASR system. In this paper we apply three new feature extraction methods, Spectral Centroid Magnitude (SCM), its first-order difference (∆SCM), and the Zak transform, to the original speech signal, using accents selected from the FARSDAT corpus, and compare their performance with that of common methods such as MFCC. Moreover, a new feature based on the MFCC algorithm is proposed for use in noisy environments. Five classifiers (MLP, KNN, PNN, RBF, and SVM) and their combination are used to evaluate the performance of each feature extraction method. Experimental results demonstrate improved recognition rates for the proposed method.
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.