Tag recommendation in social networks with the help of text summarization and KNN

Rahimi Resketi, Mahsa; Motameni, Homayun; Akbari, Ebrahim; Nematzadeh, Hossein

Volume 21, Issue 4 (3-2025) JSDP 2025, 21(4): 15-28

Rahimi Resketi M, Motameni H, Akbari E, Nematzadeh H. Tag recommendation in social networks with the help of text summarization and KNN. JSDP 2025; 21 (4) : 2
URL: http://jsdp.rcisp.ac.ir/article-1-1326-en.html

Mahsa Rahimi Resketi, Homayun Motameni*, Ebrahim Akbari, Hossein Nematzadeh

Islamic Azad University, Sari

Abstract:

Social network use has grown sharply in recent years, and interest in it continues to rise. A central concern for users is increasing the number of views their posts or messages receive in order to raise their popularity. The most effective way to do so is to use tags. Tags also contribute significantly to organizing and retrieving existing data, so automatic tag generation has attracted substantial attention. Tag recommendation from textual sources can be treated as a text extraction problem. This paper proposes a tag recommender that derives a set of suggested keywords from the data using text summarization, integrating clustering, summarization, and recommendation methods. First, a Bag of Words (BoW) model is built: the text is parsed into words and each word is reduced to its root. The resulting bag of words supports later semantic analysis. The data is reduced to its core elements by omitting prepositions and repetitions. Verbs are mined separately because of their high frequency and because their significance depends on the context of the sentence or post. The remaining words are selected according to their frequency and importance and stored with their repetition counts. Next, the data is clustered with the K-Nearest Neighbor (KNN) algorithm, and the cluster representatives serve as the output tags. The KNN algorithm is slightly modified by incorporating the Explicit Semantic Analysis (ESA) method for precise scale calculations.
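The preprocessing and neighbor-based tagging steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper, and the function names are all hypothetical, plain cosine similarity over word counts stands in for ESA, and the standard voting form of KNN is used because the abstract does not detail the clustering variant.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper's actual list is not given.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to",
              "for", "with", "and", "or", "is", "are"}


def crude_stem(word):
    """Rough suffix stripping, standing in for real root extraction."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Lowercase, tokenize, drop stop words, stem, and count repetitions."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two word-count bags (assumed stand-in for ESA)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_tags(post, labeled_posts, k=3, n_tags=2):
    """Return the most frequent tags among the k most similar labeled posts."""
    query = bag_of_words(post)
    ranked = sorted(
        labeled_posts,
        key=lambda item: cosine(query, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter()
    for _text, tags in ranked[:k]:
        votes.update(tags)
    return [tag for tag, _count in votes.most_common(n_tags)]


# Toy data, invented purely for illustration.
labeled = [
    ("The team won the football match", ["sports"]),
    ("Stocks and markets fell sharply", ["business"]),
    ("The football team scored two goals", ["sports"]),
]
print(recommend_tags("football team wins match", labeled, k=2, n_tags=1))
```

Storing each post as a `Counter` keeps the repetition counts the abstract mentions, and swapping `cosine` for a concept-based measure would be the natural place to plug in an ESA-style similarity.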
The proposed solution was evaluated on two public datasets: TPA, extracted by Aminer, and AG, extracted by ComeToMyHead. The AG dataset comprises 127,600 news articles in four tag categories: global, sports, business, and scientific concepts. Each category contains 30,000 training samples and 1,900 test samples, for a total of 31,900 per category. The results were compared with those of 13 similar studies, which fall into four categories: machine learning, long short-term memory (LSTM), convolutional neural network (CNN), and capsule-based models. The comparison showed that the proposed method achieves higher accuracy, coverage, and F-measure.
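As a quick check on the figures above, the sketch below recomputes the AG dataset size from the per-category counts and shows how an F-measure is typically derived from precision and recall. It assumes that "coverage" corresponds to recall, which the abstract does not confirm, and the counts passed to `f_measure` are made up for illustration rather than taken from the paper.

```python
# AG dataset sizes as stated in the abstract.
CATEGORIES = 4
TRAIN_PER_CATEGORY = 30_000
TEST_PER_CATEGORY = 1_900

per_category = TRAIN_PER_CATEGORY + TEST_PER_CATEGORY
total_articles = CATEGORIES * per_category
print(per_category, total_articles)  # 31900 127600


def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """Return (precision, recall, F-beta) from raw counts.

    Recall is used here as an assumed reading of the abstract's "coverage".
    """
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Hypothetical counts: 8 correct tags, 2 spurious tags, 2 missed tags.
precision, recall, f1 = f_measure(8, 2, 2)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```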
By combining semantic analysis and machine learning, this work offers a framework that improves the efficiency of tag recommendation systems and contributes to the theoretical foundation of automated keyword extraction, placing it within the broader context of information retrieval and data mining. The approach has potential applications beyond social networks, in other domains that require efficient data organization and retrieval.

Article number: 2

Keywords: label recommendation, text summarization, word embedding, k-nearest neighbor, BoW

Type of Study: Research | Subject: Paper
Received: 2022/07/19 | Accepted: 2024/12/04 | Published: 2025/04/02 | ePublished: 2025/04/02

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Signal and Data Processing
