Volume 15, Issue 1 (6-2018)                   JSDP 2018, 15(1): 87-102


Boreshban Y, Yousefinasab H, Mirroshandel S A. Providing a Religious Corpus of Question Answering System in Persian. JSDP. 2018; 15(1): 87-102
URL: http://jsdp.rcisp.ac.ir/article-1-535-en.html
Assistant Professor University of Guilan
Abstract:

Question answering is a field of natural language processing and information retrieval that has attracted researchers' attention in recent decades. Growing interest in this field has increased the need for appropriate data sources. Most work on building question answering corpora has so far been done for English, while other languages, such as Persian, lack such corpora. This article describes the development of a Persian question answering corpus called Rasayel&massayel. The corpus consists of 2,118 non-factoid and 2,051 factoid questions. Each question is annotated with the question text, the question type, the question difficulty from the questioner's and the responder's perspectives, the expected answer type at the coarse-grained and fine-grained levels, the exact answer, and the page and paragraph number of the answer. The proposed corpus can be used to train the components of a question answering system, including question classification, information retrieval, and answer extraction. The corpus is freely available for academic purposes. A question answering system built on the Rasayel&massayel corpus is also presented. Our experimental results show that the proposed system achieves 82.29% accuracy and a mean reciprocal rank of 56.73%. This is arguably the first question answering system and corpus with these features in Persian.
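For readers unfamiliar with the second metric, the sketch below shows how mean reciprocal rank (MRR) is conventionally computed: for each question, take the reciprocal of the rank of the first correct candidate (0 if none is correct), then average over all questions. This is the standard definition, not the authors' evaluation code. The function names, the candidate identifiers, and the toy data are hypothetical, and the abstract does not say what unit is ranked (for example, paragraphs or exact answers).

```python
from typing import Hashable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], gold: Set[Hashable]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none is correct."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over all questions (0.0 for an empty run list)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


# Hypothetical toy data: each entry pairs a system's ranked candidates
# with the set of candidates judged correct for that question.
runs = [
    (["p3", "p7", "p1"], {"p3"}),  # first correct at rank 1 -> 1.0
    (["p2", "p5", "p9"], {"p5"}),  # first correct at rank 2 -> 0.5
    (["p4", "p6", "p8"], {"p0"}),  # no correct candidate    -> 0.0
]

mrr = mean_reciprocal_rank(runs)
print(f"MRR = {mrr:.2%}")  # prints "MRR = 50.00%"
```

Under this definition, an MRR of 56.73% means the average reciprocal rank of the first correct answer is about 0.57, which is a value between always ranking it first (1.0) and always ranking it second (0.5).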
 

Type of Study: Applicable | Subject: Paper
Received: 2016/06/19 | Accepted: 2018/01/30 | Published: 2018/06/13 | ePublished: 2018/06/13




© 2015 All Rights Reserved | Signal and Data Processing