A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks

Jahanbakhsh-Nagadeh, Zoleikha; Feizi-Derakhshi, Mohammad-Reza; Sharifi, Arash

doi:10.52547/jsdp.18.1.50

Signal and Data Processing Journal. A scientific journal officially licensed by the Commission for Scientific Publications of the Ministry of Science, Research and Technology (MSRT). Publisher: Research Center for Development of Technologies

Volume 18, Issue 1 (5-2021) JSDP 2021, 18(1): 50-29

Jahanbakhsh-Nagadeh Z, Feizi-Derakhshi M, Sharifi A. A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks. JSDP 2021; 18 (1) :50-29
URL: http://jsdp.rcisp.ac.ir/article-1-1033-en.html


Zoleikha Jahanbakhsh-Nagadeh, Mohammad-Reza Feizi-Derakhshi*, Arash Sharifi

Department of Computer Engineering, University of Tabriz, Tabriz, Iran.

Abstract:

A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Identifying the language of rumors can therefore help in detecting them. Previous research on rumor detection has focused more on the contextual information in reply tweets and less on the content features of the original rumor. Most studies have addressed the English language, and only limited work has been done on detecting rumors in Persian. This study analyzes the content of the original rumor and introduces informative content features for identifying Persian rumors on Twitter and Telegram early (i.e., when a rumor has been published on news media but has not yet spread on social media). The proposed model is based on physical and non-physical content features in three categories: lexical, syntactic, and pragmatic. These features combine common content features with newly proposed content-based features. Since no social context information is available at the time a rumor is posted, the proposed model is independent of propagation-based features and relies only on the content-based information of the original rumor. Although the proposed model ignores much information (including user information, users' reactions to the rumor, and propagation structures), content analysis of the original rumor still yields information that is useful for classification.
Several experiments were performed on various combinations of feature sets (i.e., common and proposed content features) to explore the capability of the features, separately and jointly, to distinguish rumors from non-rumors. To this end, three machine learning algorithms, Random Forest (RF), AdaBoost, and Support Vector Machine (SVM), were used as strong classifiers to evaluate the accuracy of the proposed model. Feature selection techniques are needed to achieve the best performance of the classification algorithms on the training dataset. In this study, the Sequential Forward Floating Search (SFFS) approach was used to select valuable features. In addition, t-test results (p-value <= 0.05) show that most of the new features proposed in this study differ significantly between rumor and non-rumor documents. The experimental results show that the newly proposed features improve the accuracy of rumor detection. The F-measure of the proposed model for detecting Persian rumors was 0.848 on the Twitter dataset, 0.952 on the Kermanshah earthquake dataset, and 0.867 on the Telegram dataset, which indicates that the proposed method can identify rumors by focusing only on the content features of the original rumor text. The evaluation on Twitter rumors shows that, despite the short length of tweets and the limited content information that can be extracted from them, the proposed model can detect Twitter rumors with acceptable accuracy. These results demonstrate the ability of content features to distinguish rumors from non-rumors.
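The SFFS feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the scoring criterion, so `toy_score` and its weights are hypothetical stand-ins for a cross-validated classifier score, and the function name `sffs` is an assumption. The sketch assumes `k` is no larger than `n_features`.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Subset = FrozenSet[int]


def sffs(n_features: int, k: int, score: Callable[[Subset], float]) -> Subset:
    """Sequential Forward Floating Search for a feature subset of size k."""
    selected: Subset = frozenset()
    best: Dict[int, Tuple[Subset, float]] = {}  # best subset found per size

    while len(selected) < k:
        # Forward step: add the single feature that gives the highest score.
        remaining = [f for f in range(n_features) if f not in selected]
        added = max(remaining, key=lambda f: score(selected | {f}))
        selected = selected | {added}
        current = score(selected)
        if len(selected) not in best or current > best[len(selected)][1]:
            best[len(selected)] = (selected, current)

        # Floating step: drop the least useful feature for as long as the
        # reduced subset beats the best subset of that size seen so far.
        while len(selected) > 2:
            weakest = max(selected, key=lambda f: score(selected - {f}))
            reduced = selected - {weakest}
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][1]:
                selected = reduced
                best[len(reduced)] = (reduced, reduced_score)
            else:
                break

    return best[k][0]


def toy_score(subset: Subset) -> float:
    """Hypothetical criterion: additive weights plus two interactions."""
    weights = [5, 4, 4, 3]
    total = sum(weights[f] for f in subset)
    if {1, 2} <= subset:  # features 1 and 2 are more useful together
        total += 6
    if {0, 2} <= subset:  # features 0 and 2 are partly redundant
        total -= 4
    return total


result = sffs(n_features=4, k=3, score=toy_score)
print(sorted(result), toy_score(result))  # [1, 2, 3] 17
```

On this toy criterion, plain greedy forward selection would stop at features {0, 1, 2} with a score of 15. The floating step removes feature 0 once features 1 and 2 are both present, which lets the search reach {1, 2, 3}. In practice the score function would wrap a classifier evaluation, for example the F-measure of RF, AdaBoost, or SVM on held-out data.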

Keywords: Persian rumors detection, Content analysis, Physical and non-physical content features, Text processing


Type of Study: Research | Subject: Paper
Received: 2019/06/13 | Accepted: 2020/09/23 | Published: 2021/05/22 | ePublished: 2021/05/22



Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.