Volume 17, Issue 1 (6-2020)                   JSDP 2020, 17(1): 29-46



Taheri Khameneh B, Shokrzadeh H. Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics. JSDP 2020; 17(1): 29-46
URL: http://jsdp.rcisp.ac.ir/article-1-882-en.html
Department of Computer Engineering, Pardis Branch, Islamic Azad University
Abstract:
This paper discusses the Semantic Web, the next stage in the development of the World Wide Web. The Web is undoubtedly one of the most important services on the Internet and has had the greatest impact on the spread of the Internet through human societies. Growing Internet penetration has in turn driven growth in the volume of information on the Web, and this massive growth has created several problems, the most important of which is search. Search engines today use a variety of techniques to deliver high-quality results, yet search results are still not ideal, and information retrieval techniques can increase search accuracy only to a certain extent. Most web content is designed for human use, and machines can understand and manipulate data only at the word level; this is the major limitation on providing better services to web users. The proposed solution is to represent web content in a way that machines can readily understand. This solution, which will bring about a major transformation of the Web, is called the Semantic Web. The purpose of this research is to return better results for the searches of Semantic Web users. In the proposed method, the expression searched by the user is examined with respect to its related topics. The response obtained at this stage is passed to a ranking system, consisting of a fuzzy decision-making system and a hierarchical clustering system, which returns better results to the user. The proposed method does not require any prior knowledge to cluster the data. The accuracy and comprehensiveness of the response are then measured, and finally the F-measure is applied to obtain a single criterion for evaluating the performance of the algorithm and systems.
The test results show that the method presented in this paper provides a more precise and comprehensive response than similar methods, increasing accuracy by up to 1.22% on average.
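The evaluation step mentioned in the abstract can be illustrated with a small sketch. It assumes that "accuracy" and "comprehensiveness" correspond to the standard precision and recall measures of information retrieval, and that the F criterion is the balanced F-measure (F1). The function name, document identifiers, and sample data are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: documents returned by the system (hypothetical IDs).
    relevant:  documents judged relevant for the query (hypothetical IDs).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned

    # Precision: fraction of returned documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0

    # F1: harmonic mean of precision and recall (0 when both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative data only: 4 documents returned, 3 relevant, 2 in common.
    p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
    print(f"{p:.3f} {r:.3f} {f:.3f}")  # prints "0.500 0.667 0.571"
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which is why it is commonly used as a single figure for comparing retrieval methods.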
Type of Study: Applicable | Subject: Paper
Received: 2018/07/14 | Accepted: 2019/07/10 | Published: 2020/06/21 | ePublished: 2020/06/21




Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing