Volume 13, Issue 2 (9-2016)                   JSDP 2016, 13(2): 11-23

Moradi A, Shahbahrami A, Ebrahimi Atani R, Alidoust Nia M. Persian XML Documents Metaheuristic Clustering Based on Structure and Content Similarity. JSDP 2016; 13 (2) :11-23
URL: http://jsdp.rcisp.ac.ir/article-1-29-en.html
University of Guilan
Abstract:

As the number of XML documents grows, organizing them effectively becomes essential for retrieving useful information from them. One possible solution is to cluster XML documents in order to discover knowledge. The key issue in clustering XML documents is how to measure the similarity between them. Conventional text-document clustering uses similarity measures based only on content, so the structural information contained in XML documents is ignored. This paper proposes a new model, named the matrix space model, that represents both the structural and the content features of XML documents. Based on this model, a Jaccard similarity measure is defined, and the imperialist competitive algorithm (also translated as the colonial competitive algorithm) is used to cluster XML documents. Experimental results show that the proposed model is effective at identifying similar documents, that is, documents that closely share the same structure and content information. The method can improve clustering accuracy and increase the productivity of working with XML data.
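
The abstract does not give the definition of the matrix space model, so the following is only an illustrative sketch of how a Jaccard measure can combine structure and content. It assumes, purely for illustration, that each document is reduced to a set of (element path, term) pairs, which is equivalent to the nonzero cells of a binary path-by-term matrix. The function names `path_term_features` and `jaccard` and the sample documents are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET


def path_term_features(xml_text):
    """Return the set of (element path, lowercase term) pairs in a document.

    Assumption: this set stands in for the nonzero cells of a binary
    path-by-term matrix. Only element text is used; tail text is ignored.
    """
    root = ET.fromstring(xml_text)
    features = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        for term in (node.text or "").split():
            features.add((path, term.lower()))
        for child in node:
            walk(child, path)

    walk(root, "")
    return features


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two feature sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty documents are identical
    return len(a & b) / len(a | b)


doc_a = "<article><title>XML clustering</title><body>matrix model</body></article>"
doc_b = "<article><title>XML clustering</title><body>vector model</body></article>"
doc_c = "<book><name>XML clustering</name></book>"

features_a = path_term_features(doc_a)
features_b = path_term_features(doc_b)
features_c = path_term_features(doc_c)

print(jaccard(features_a, features_b))  # 0.6: same structure, mostly the same terms
print(jaccard(features_a, features_c))  # 0.0: same terms, but under different paths
```

In this sketch, documents A and C share the words "XML clustering" but score 0.0 because the words occur under different element paths, which is the kind of structural difference a content-only measure would miss.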

Type of Study: Applicable | Subject: Paper
Received: 2013/04/27 | Accepted: 2016/06/15 | Published: 2016/09/18 | ePublished: 2016/09/18

Rights and permissions
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing