Volume 15, Issue 1 (6-2018) | JSDP 2018, 15(1): 71-86




Asgarian E, Kahani M, Sharifi S. HesNegar: Persian Sentiment WordNet. JSDP 2018; 15 (1) :71-86
URL: http://jsdp.rcisp.ac.ir/article-1-554-en.html
Ferdowsi University of Mashhad
Abstract:

Awareness of others' opinions plays a crucial role in decision making, from ordinary customers to top-level executives of manufacturing companies and other organizations. Today, with the advent of Web 2.0 and the expansion of social networks, a vast amount of text expressing people's opinions has been created. However, the enormous number of documents, the variety of opinion sources, and the presence of opposing opinions about an entity make extracting and analyzing opinions very difficult. Hence, there is a need for methods to explore and summarize existing opinions. Accordingly, a new trend called "opinion mining" has recently emerged in natural language processing. The main purpose of opinion mining is to extract and detect people's positive or negative sentiments (sense of satisfaction) from text reviews. The absence of a comprehensive Persian sentiment lexicon is one of the main challenges of opinion mining in Persian.
In this paper, a new methodology for developing a Persian Sentiment WordNet (HesNegar) is presented, using various Persian and English resources. A corpus of Persian reviews developed for opinion mining studies is also introduced. To develop HesNegar, a comprehensive Persian WordNet (FerdowsNet) with high recall and adequate precision, based on Princeton WordNet, was first created. Then, the polarity of each synset in the English SentiWordNet was mapped to the corresponding words in HesNegar. In the conducted tests, HesNegar achieved a precision of 0.86 and a recall of 0.75, which indicates that it can be used as a comprehensive Persian SentiWordNet. The findings and resources developed in this study could prove useful in advancing opinion mining research in Persian and in similar languages, such as Urdu and Arabic.
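The mapping step described above, in which synset polarities from SentiWordNet are carried over to Persian words through a WordNet alignment, can be illustrated with a minimal sketch. This is not the authors' implementation: the synset identifiers, the scores, the three-word Persian fragment, and the choice to average scores across a word's synsets are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical SentiWordNet-style scores: synset id -> (positive, negative).
# The ids and values are invented for this sketch.
SENTI_SCORES: Dict[str, Tuple[float, float]] = {
    "good.a.01": (0.75, 0.0),
    "bad.a.01": (0.0, 0.625),
    "cold.a.01": (0.0, 0.25),
    "cold.a.02": (0.0, 0.75),
}

# Hypothetical fragment of a Persian WordNet aligned to English synsets:
# Persian word -> ids of the English synsets it is linked to.
PERSIAN_SYNSETS: Dict[str, List[str]] = {
    "خوب": ["good.a.01"],
    "بد": ["bad.a.01"],
    "سرد": ["cold.a.01", "cold.a.02"],
}


def project_polarity(
    word_synsets: Dict[str, List[str]],
    senti_scores: Dict[str, Tuple[float, float]],
) -> Dict[str, Tuple[float, float]]:
    """Assign each word the mean (positive, negative) score of its synsets.

    Averaging is an assumption of this sketch; words whose synsets have
    no score are left out of the resulting lexicon.
    """
    lexicon: Dict[str, Tuple[float, float]] = {}
    for word, synset_ids in word_synsets.items():
        scored = [senti_scores[s] for s in synset_ids if s in senti_scores]
        if not scored:
            continue
        pos = sum(p for p, _ in scored) / len(scored)
        neg = sum(n for _, n in scored) / len(scored)
        lexicon[word] = (pos, neg)
    return lexicon


def label(pos: float, neg: float) -> str:
    """Collapse a (positive, negative) score pair into a coarse polarity."""
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


lexicon = project_polarity(PERSIAN_SYNSETS, SENTI_SCORES)
for word, (pos, neg) in lexicon.items():
    print(word, pos, neg, label(pos, neg))
```

Averaging treats every sense of a word as equally likely; a weighting by sense frequency would be another plausible choice, and the abstract does not say which aggregation HesNegar uses.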
 

Type of Study: Applicable | Subject: Paper
Received: 2017/02/12 | Accepted: 2016/10/24 | Published: 2018/06/13 | ePublished: 2018/06/13


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing