Volume 13, Issue 1 (6-2016) | JSDP 2016, 13(1): 57-70




Shekofteh Y, Gholipor H, Goodarzi M, Kabudian J, Almasganj F, Reza S et al. Fast estimation of warping factor in the vocal tract length normalization using obtained scores of gender detection modeling. JSDP 2016; 13(1):57-70
URL: http://jsdp.rcisp.ac.ir/article-1-254-en.html
Abstract:

The performance of automatic speech recognition (ASR) systems is adversely affected by variations in speakers, audio channels and environmental conditions, and making these systems robust to such variations remains a major challenge. One of the main sources of inter-speaker variation is the difference in Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effective method for coping with this variation: the speech spectrum of each speaker is frequency warped according to a warping factor specific to that speaker. In this paper, we first developed the common search-based method for obtaining the appropriate warping factor on an HMM-based Persian continuous speech recognition system. Then, noting the computational cost of the search-based method, we proposed a linear regression process that estimates the warping factor from the scores generated by our gender detection system. Experimental results on a Persian conversational speech database show an improvement of about 0.54 percent in word recognition accuracy, as well as a significant reduction in the computational cost of estimating the warping factor, compared to the search-based approach.
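The contrast between the two estimation strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate grid (0.88 to 1.12 in steps of 0.02), the use of a single scalar gender score as the only predictor, and the names `WARP_GRID`, `search_warp`, `fit_line` and `predict_warp` are all assumptions made for illustration. The search-based estimate scores every candidate warping factor, whereas the regression-based estimate needs only one multiply-add once the line has been fitted on speakers whose warping factors were found by search.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical grid of candidate warping factors; the abstract does not
# state which grid the authors used.
WARP_GRID = [round(0.88 + 0.02 * i, 2) for i in range(13)]  # 0.88 .. 1.12


def search_warp(score_fn: Callable[[float], float],
                grid: Sequence[float] = WARP_GRID) -> float:
    """Search-based estimate: evaluate every candidate and keep the best.

    score_fn(alpha) stands in for the recognizer's likelihood of the
    speaker's features warped by alpha; in a real system each call is costly.
    """
    return max(grid, key=score_fn)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a * x + b with one predictor.

    xs: gender detection scores of training speakers (assumed scalar).
    ys: warping factors found for those speakers by search_warp.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def predict_warp(score: float, a: float, b: float,
                 lo: float = WARP_GRID[0],
                 hi: float = WARP_GRID[-1]) -> float:
    """Regression-based estimate, clamped to the range of the search grid."""
    return min(hi, max(lo, a * score + b))
```

In this sketch the cost of the search grows with the number of grid points times the cost of one likelihood evaluation, while `predict_warp` is constant time per speaker, which is the kind of saving the abstract refers to.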

Type of Study: Research | Subject: Paper
Received: 2014/06/30 | Accepted: 2016/02/26 | Published: 2016/06/22 | ePublished: 2016/06/22



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing