Volume 15, Issue 2 (9-2018)                   JSDP 2018, 15(2): 89-102

Najafzadeh M, Rahati Quchan S, Ghaemi R. A Semi-supervised Framework Based on Self-constructed Adaptive Lexicon for Persian Sentiment Analysis. JSDP. 2018; 15(2):89-102
URL: http://jsdp.rcisp.ac.ir/article-1-644-en.html
Mashhad Branch, Islamic Azad University
Abstract:
With the emergence of Web 2.0 and 3.0, users' contributions to the WWW have produced a huge volume of valuable expressed opinions. Because analyzing such big data manually is difficult or impossible, sentiment analysis, as a branch of natural language processing, has received considerable attention. Unlike more widely studied languages, Persian has been the subject of only a limited number of sentiment analysis studies. In this study, a semi-supervised framework is proposed for Persian sentiment analysis for the first time. One of the most recent Persian studies is an algorithm based on extracting adaptive (dataset-sensitive), expert-based emotional patterns; this research proposes extracting the same state-of-the-art emotional patterns automatically. Moreover, the application of an HMM classifier that uses these features as its states is analyzed, and HMM-based sentiment analysis is further improved by combining it with a rule-based classifier for the opinion assignment process. In addition, toward intelligent self-training, a criterion for judging whether an output is highly reliable is presented; when the criterion is satisfied, self-training is performed in both learning systems, "lexicon extraction" and "classifier." Applied to the base dataset, the proposed method achieves 90% accuracy despite generating its lexicon without expert input, which is considerably better than the state-of-the-art supervised and semi-supervised methods. The method is also evaluated with a 10/90 train/test ratio, where it achieves 80% accuracy, demonstrating its reliability.
Type of Study: Applicable | Subject: Paper
Received: 2017/04/10 | Accepted: 2017/10/25 | Published: 2018/09/16 | ePublished: 2018/09/16
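
The self-training step described in the abstract, where an output is fed back into lexicon extraction and the classifier only when it passes a reliability criterion, can be illustrated with a minimal Python sketch. This is not the authors' implementation: a simple count-based polarity lexicon stands in for the HMM and rule-based classifiers, the absolute mean polarity with a fixed threshold stands in for the paper's reliability criterion, and every function name, the threshold value, and the toy English tokens are assumptions made for illustration.

```python
from collections import Counter


def build_lexicon(labeled):
    """Map each word to a polarity in [-1, 1] from (tokens, label) pairs.

    Labels are +1 (positive) or -1 (negative). Hypothetical stand-in for
    the paper's automatic emotional-pattern extraction.
    """
    pos, neg = Counter(), Counter()
    for tokens, label in labeled:
        (pos if label == 1 else neg).update(tokens)
    return {w: (pos[w] - neg[w]) / (pos[w] + neg[w]) for w in set(pos) | set(neg)}


def predict(lexicon, tokens):
    """Return (label, confidence) for one tokenized document.

    Confidence is the absolute mean polarity over all tokens, with
    unknown words counted as 0. Assumed reliability score, not the
    criterion defined in the paper.
    """
    score = sum(lexicon.get(w, 0.0) for w in tokens)
    label = 1 if score >= 0 else -1
    confidence = abs(score) / len(tokens) if tokens else 0.0
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.5, max_rounds=10):
    """Grow the labeled set with confidently predicted documents.

    Each round rebuilds the lexicon, labels the remaining pool, and keeps
    only predictions whose confidence reaches the threshold. Stops when a
    round accepts nothing or max_rounds is reached.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        lexicon = build_lexicon(labeled)
        accepted, remaining = [], []
        for tokens in pool:
            label, confidence = predict(lexicon, tokens)
            if confidence >= threshold:
                accepted.append((tokens, label))
            else:
                remaining.append(tokens)
        if not accepted:
            break
        labeled.extend(accepted)
        pool = remaining
    return build_lexicon(labeled), labeled, pool


# Toy data (hypothetical): a small labeled seed and a larger unlabeled pool.
seed = [(["great", "phone"], 1), (["terrible", "battery"], -1)]
unlabeled = [
    ["great", "camera"],
    ["terrible", "screen"],
    ["camera", "excellent"],
    ["average", "price"],
]

lexicon, labeled, pool = self_train(seed, unlabeled)
print(sorted(lexicon.items()))
print("still unlabeled:", pool)
```

In this toy run, "excellent" never appears in the seed data; it enters the lexicon only in a later round, after "camera" has been learned from a confidently labeled document. Documents with no known words stay in the unlabeled pool instead of being forced into a class.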


© 2015 All Rights Reserved | Signal and Data Processing