1. [1] L. Lee, J. Glass, H. Lee, and C. Chan, "Spoken Content Retrieval-Beyond Cascading Speech Recognition with Text Retrieval," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 23, no. 9, pp. 1389-1420, Sep. 2015. [DOI:10.1109/TASLP.2015.2438543]
2. [2] M. Larson and G. J. F. Jones, "Spoken Content Retrieval: A Survey of Techniques and Technologies," Found. Trends® Inf. Retr., vol. 5, no. 3, pp. 235-422, 2012. [DOI:10.1561/1500000020]
3. [3] J. G. Fiscus, J. Ajot, J. S. Garofolo, and G. Doddington, "Results of the 2006 Spoken Term Detection Evaluation," Proc. ACM SIGIR Work. Search. Spontaneous Conversational Speech, pp. 51-55, 2006.
4. [4] J. Tejedor et al., "ALBAYZIN 2016 spoken term detection evaluation: an international open competitive evaluation in Spanish," EURASIP J. Audio, Speech, Music Process., vol. 2017, no. 1, p. 22, 2017. [DOI:10.1186/s13636-017-0119-z]
5. [5] J. S. Garofolo, C. G. P. Auzanne, and E. M. Voorhees, "The TREC Spoken Document Retrieval Track: A Success Story," Proc. TREC-8, vol. 8940, no. 500-246, pp. 109-130, 1999.
6. [6] J. Trmal et al., "The Kaldi OpenKWS System: Improving Low Resource Keyword Search," Interspeech 2017, pp. 3597-3601, 2017. [DOI:10.21437/Interspeech.2017-601]
7. [7] X. Anguera, L. J. Rodriguez-Fuentes, A. Buzo, F. Metze, I. Szoke, and M. Penagarikano, "QUESST2014: Evaluating Query-by-Example Speech Search in a zero-resource setting with real-life queries," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. 2015-Augus, pp. 5833-5837, 2015. [DOI:10.1109/ICASSP.2015.7179090]
8. [8] T. Alumäe et al., "The 2016 BBN Georgian telephone speech keyword spotting system," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., pp. 5755-5759, 2017. [DOI:10.1109/ICASSP.2017.7953259]
9. [9] Z. Gomar, Discriminative Articulatory Models for Spoken Term Detection in Low-Resource Conditions, M.S. Thesis, Sharif University of Technology, 2016.
10. [10] M. Crochemore, "Transducers and repetitions," Theor. Comput. Sci., vol. 45, pp. 63-86, 1986. [DOI:10.1016/0304-3975(86)90041-1]
11. [11] J. S. Bridle, "An efficient elastic-template method for detecting given words in running speech," Brit. Acoust. Soc. Meet., pp. 1-4, 1973.
12. [12] A. Mandal, K. R. Prasanna Kumar, and P. Mitra, "Recent developments in spoken term detection: a survey," Int. J. Speech Technol., vol. 17, no. 2, pp. 183-198, Jun. 2014. [DOI:10.1007/s10772-013-9217-1]
13. [13] J. Bridle, "An efficient elastic template method for detecting given keywords in the running speech," Proc. Br. Acoust. Soc. Meet., pp. 1-4, 1973.
14. [14] C. Parada, A. Sethy, and B. Ramabhadran, "Query-by-example spoken term detection for OOV terms," Proc. 2009 IEEE Work. Autom. Speech Recognit. Understanding, ASRU 2009, pp. 404-409, 2009. [DOI:10.1109/ASRU.2009.5373341]
15. [15] J. Tejedor, I. Szöke, and M. Fapso, "Novel methods for query selection and query combination in query-by-example spoken term detection," Proc. 2010 Int. Work. Search. spontaneous conversational speech - SSCS '10, pp. 15-20, 2010. [DOI:10.1145/1878101.1878106]
16. [16] M. C. Madhavi and H. A. Patil, "Partial matching and search space reduction for QbE-STD," Comput. Speech Lang., vol. 45, pp. 58-82, Sep. 2017. [DOI:10.1016/j.csl.2017.03.004]
17. [17] Y. Zhang and J. R. Glass, "Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams," Proc. 2009 IEEE Work. Autom. Speech Recognit. Understanding, ASRU 2009, pp. 398-403, 2009. [DOI:10.1109/ASRU.2009.5372931]
18. [18] M. Huijbregts, M. McLaren, and D. Van Leeuwen, "Unsupervised acoustic sub-word unit detection for query-by-example spoken term detection," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., pp. 4436-4439, 2011. [DOI:10.1109/ICASSP.2011.5947338]
19. [19] P. Fousek and H. Hermansky, "Towards ASR Based on Hierarchical Posterior-Based Keyword Recognition," 2006 IEEE Int. Conf. Acoust. Speech Signal Process. Proc., vol. 1, pp. I-433-I-436.
20. [20] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust. Speech Signal Process., vol. 26, no. 1, pp. 43-49, 1978. [DOI:10.1109/TASSP.1978.1163055]
21. [21] C. Chan and L. Lee, "Unsupervised Spoken-Term Detection with Spoken Queries Using Segment-based Dynamic Time Warping," Evaluation, no. September, pp. 693-696, 2010.
22. [22] D. Ram, L. Miculicich, and H. Bourlard, "CNN based query by example spoken term detection," Proc. Annu. Conf. Int. Speech Commun. Assoc. INTERSPEECH, vol. 2018-September, pp. 92-96, 2018. [DOI:10.21437/Interspeech.2018-1722]
23. [23] C. W. Ao and H. Y. Lee, "Query-by-example spoken term detection using attention-based multi-hop networks," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. 2018-April, pp. 6264-6268, 2018.
24. [24] R. C. Rose and D. B. Paul, "A hidden Markov model based keyword recognition system," Int. Conf. Acoust. Speech, Signal Process., vol. 1, pp. 129-132, 1990.
25. [25] J. G. Wilpon, L. R. Rabiner, C.-H. Lee, and E. R. Goldman, "Automatic Recognition of Keywords in Unconstrained Speech Using Hidden Markov Models," IEEE Trans. Acoust. Speech Signal Process., vol. 38, no. 11, pp. 1870-1878, 1990. [DOI:10.1109/29.103088]
26. [26] A. Tavanaei, H. Sameti, and S. H. Mohammadi, "False alarm reduction by improved filler model and post-processing in speech keyword spotting," IEEE Int. Work. Mach. Learn. Signal Process., 2011. [DOI:10.1109/MLSP.2011.6064588]
27. [27] R. Sukkar and J. Wilpon, "A two pass classifier for utterance rejection in Keyword Spotting," Acoust. Speech, Signal …, pp. 1-4, 1993. [DOI:10.1109/ICASSP.1993.319338]
28. [28] M. G. Rahim, C. H. Lee, and B. H. Juang, "Discriminative utterance verification for connected digits recognition," IEEE Trans. Speech Audio Process., vol. 5, no. 3, pp. 266-277, 1997. [DOI:10.1109/89.568733]
29. [29] "KWS16 Evaluation Plan." [Online]. Available: https://www.nist.gov/sites/default/files/documents/itl/iad/mig/KWS16-evalplan-v04.pdf.
30. [30] C. Chelba, J. Silva, and A. Acero, "Soft indexing of speech content for search in spoken documents," Comput. Speech Lang., vol. 21, no. 3, pp. 458-478, Jul. 2007. [DOI:10.1016/j.csl.2006.09.001]
31. [31] G. Chen, O. Yilmaz, J. Trmal, D. Povey, and S. Khudanpur, "Using proxies for OOV keywords in the keyword search task," in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013, pp. 416-421. [DOI:10.1109/ASRU.2013.6707766]
32. [32] T. K. Chia, K. C. Sim, H. Li, and H. T. Ng, "Statistical lattice-based spoken document retrieval," ACM Trans. Inf. Syst., vol. 28, no. 1, pp. 1-30, Jan. 2010. [DOI:10.1145/1658377.1658379]
33. [33] Y. C. Pan and L. S. Lee, "Performance analysis for lattice-based speech indexing approaches using words and subword units," IEEE Trans. Audio, Speech Lang. Process., vol. 18, no. 6, pp. 1562-1574, 2010. [DOI:10.1109/TASL.2009.2037404]
34. [34] W. Hartmann, V. B. Le, A. Messaoudi, L. Lamel, and J. L. Gauvain, "Comparing decoding strategies for subword-based keyword spotting in low-resourced languages," Proc. Annu. Conf. Int. Speech Commun. Assoc. INTERSPEECH, no. September, pp. 2764-2768, 2014.
35. [35] L. S. Lee and Y. C. Pan, "Voice-based information retrieval - How far are we from the text-based information retrieval?," Proc. 2009 IEEE Work. Autom. Speech Recognit. Understanding, ASRU 2009, pp. 26-43, 2009. [DOI:10.1109/ASRU.2009.5372952]
36. [36] D. Can, "Indexation, retrieval & decision techniques for spoken term detection," Ph.D. dissertation, Boğaziçi University, 2010.
37. [37] M. Qadiri Nia, "Design and Performance Improvement of a Spoken Term Detection System," M.S. thesis, Sharif University of Technology, 2015.
38. [38] M. Abbassian, "Keyword Spotting in Persian Speech Using a Hybrid Model of DNN and HMM," M.S. thesis, Amir Kabir University of Technology, 2017.
39. [39] S.S. Sarfjou, Introducing a New Information Retrieval Framework for Persian Speech Retrieval, M.S. thesis, Qom University, 2012.
40. [40] M.Y. Akhlaqi, "Introducing a New Information Retrieval Method for Speech Recognized Texts," M.S. thesis, Qom University, 2014.
41. [41] M.H. Soltani, "Introducing a New Information Retrieval Framework for Speech Retrieval," M.S. thesis, Qom University, 2014.
42. [42] H. Naderi, Keyword Spotting in Speech Utterance, M.S. thesis, Shahrood University of Technology, 2013.
43. [43] M. Bijankhan, J. Sheikhzadegan, M. R. Roohani, Y. Samareh, C. Lucas, and M. Tebyani, "FARSDAT-The speech database of Farsi spoken language," Proc. Aust. Conf. Speech Sci. Technol., vol. 2, no. 0, pp. 826-831, 1994.
44. [44] J. Sheikhzadegan and M. Bijankhan, "Persian speech databases," 2nd Work. Persian Lang. Comput., pp. 247-261, 2006.
45. [45] M. Bijankhan, J. Sheykhzadegan, M. R. Roohani, R. Zarrintare, S. Z. Ghasemi, and M. E. Ghasedi, "Tfarsdat - The telephone farsi speech database," EUROSPEECH 2003 - 8th Eur. Conf. Speech Commun. Technol., 2003.
46. [46] D. Can and M. Saraclar, "Lattice Indexing for Spoken Term Detection," IEEE Trans. Audio. Speech. Lang. Processing, vol. 19, no. 8, pp. 2338-2347, Nov. 2011. [DOI:10.1109/TASL.2011.2134087]
47. [47] Z. Lv, J. Kang, W. Q. Zhang, and J. Liu, "An LSTM-CTC based verification system for proxy-word based OOV keyword search," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., pp. 5655-5659, 2017. [DOI:10.1109/ICASSP.2017.7953239]
48. [48] C. Parada, A. Sethy, and B. Ramabhadran, "Balancing false alarms and hits in spoken term detection," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., pp. 5286-5289, 2010. [DOI:10.1109/ICASSP.2010.5494966]
49. [49] V. Zue, S. Seneff, and J. Glass, "TIMIT acoustic-phonetic continuous speech corpus," Speech Commun., vol. 9, no. 4, pp. 351-356, 1990. [DOI:10.1016/0167-6393(90)90010-7]
50. [50] B. BabaAli, "State-of-the-art and Efficient Framework for Persian Speech Recognition," JSDP, vol. 13, no. 3, pp. 51-62, 2017. [DOI:10.18869/acadpub.jsdp.13.3.51]
51. [51] M. Eslami, M. Sharifi Atashgah, S. Alizade, and T. Zandi, "Persian Generative Lexicon," Proceedings of the first Persian language and computer research workshop, 2005.
52. [52] "هضم، Hazm." [Online]. Available: https://github.com/sobhe/hazm.
53. [53] M. Federico, N. Bertoldi, and M. Cettolo, "IRSTLM: An open source toolkit for handling large scale language models," Proc. Annu. Conf. Int. Speech Commun. Assoc. INTERSPEECH, pp. 1618-1621, 2008.
54. [54] A. Martin, G. Doddington, T. Kamm, M. Ordowski, and M. Przybocki, "The DET Curve in Assessment of Detection Task Performance," Proc. Eurospeech '97, pp. 1895-1898, 1997.
55. [55] D. Jurafsky and J. H. Martin, Speech and language processing. 1999.
56. [56] D. Povey et al., "The subspace Gaussian mixture model - A structured model for speech recognition," Comput. Speech Lang., vol. 25, no. 2, pp. 404-439, 2011. [DOI:10.1016/j.csl.2010.06.003]
57. [57] J. Tejedor, D. Wang, J. Frankel, S. King, and J. Colás, "A comparison of grapheme and phoneme-based units for Spanish spoken term detection," Speech Commun., vol. 50, no. 11-12, pp. 980-991, 2008. [DOI:10.1016/j.specom.2008.03.005]
58. [58] Y. Wang and F. Metze, "An in-depth comparison of keyword specific thresholding and sum-to-one score normalization," Proc. Annu. Conf. Int. Speech Commun. Assoc. INTERSPEECH, pp. 2474-2478, 2014.