Volume 19, Issue 2 (Mehr 1401 / 2022), pages 175-196

Zare Chahooki M A, khalifeh zadeh Z. A General Investigation on the Combination of Local and Global Feature Selection Methods for Request Identification on Telegram. JSDP 2022; 19 (2) :175-196
URL: http://jsdp.rcisp.ac.ir/article-1-1110-fa.html


Department of Computer Engineering, Faculty of Engineering, Yazd University
Abstract:
Telegram is an open-source, cloud-based messaging service. Owing to features such as multi-language support and the ability to create groups and channels with large numbers of users, it has become a popular and widely used messenger. The large volume of text in Telegram groups contains hidden knowledge. Extracting this knowledge, such as the requests contained in users' messages, can be valuable: identifying requests makes it possible to respond to users' needs and help them reach what they are looking for quickly, which in turn supports the growth of users' businesses. Because the feature space of text data is high-dimensional, reducing it through feature selection becomes necessary. Among feature selection methods, two filter-based families were chosen: local and global. By examining and combining the most widely used methods of each family, we obtained an optimal subset of important features. By reducing the number of features, this combined method increased the accuracy of request identification, improved text classification performance, and reduced training time and computation.
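As a rough illustration of what combining local and global filter-based selection can look like, the sketch below scores terms in a toy corpus two ways and takes the union of the top-ranked terms. The abstract does not name the specific metrics or the combination rule, so everything here is an assumption made for illustration: the sample messages, the use of document frequency as the global (class-independent) score, the probability difference P(term | class) - P(term | other classes) as the local (class-specific) score, and the function names.

```python
from collections import Counter

# Toy labelled messages (hypothetical data): 1 = request, 0 = other.
MESSAGES = [
    ("please send me the price list", 1),
    ("can someone send the price", 1),
    ("please share the catalog", 1),
    ("the meeting was great today", 0),
    ("thanks for the great photos", 0),
    ("today the weather was nice", 0),
]

def doc_sets(messages):
    """Tokenise each message into a set of terms (presence, not counts)."""
    return [(set(text.split()), label) for text, label in messages]

def global_scores(docs):
    """Global filter (assumed metric): document frequency, one score per term."""
    df = Counter()
    for terms, _ in docs:
        df.update(terms)
    return dict(df)

def local_scores(docs, cls):
    """Local filter (assumed metric): P(term | cls) - P(term | not cls)."""
    in_cls = [terms for terms, label in docs if label == cls]
    out_cls = [terms for terms, label in docs if label != cls]
    vocab = set().union(*(terms for terms, _ in docs))
    scores = {}
    for term in vocab:
        p_in = sum(term in t for t in in_cls) / len(in_cls)
        p_out = sum(term in t for t in out_cls) / len(out_cls)
        scores[term] = p_in - p_out
    return scores

def top_k(scores, k):
    """Return the k best terms; ties are broken alphabetically for determinism."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]

def combined_selection(docs, classes, k_global, k_local):
    """Assumed combination rule: union of global top-k and each class's local top-k."""
    selected = set(top_k(global_scores(docs), k_global))
    for cls in classes:
        selected |= set(top_k(local_scores(docs, cls), k_local))
    return sorted(selected)

DOCS = doc_sets(MESSAGES)
SELECTED = combined_selection(DOCS, classes=(0, 1), k_global=2, k_local=2)
print(SELECTED)
```

In this toy data the local scores surface class-specific terms such as "please" and "price", while the global score favours terms that are simply frequent. Other rules, such as intersecting the two rankings or weighting them, would be equally plausible readings of "combination".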
 
Article number: 12
Full text: PDF, 1656 KB
Type of study: Applied | Subject: Text processing
Received: 1398/10/22 | Accepted: 1399/12/12 | Published: 1401/7/8 | ePublished: 1401/7/8 (Solar Hijri dates)


Rights and permissions
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.

All rights to this website belong to the scientific research quarterly Signal and Data Processing (JSDP).