Volume 13, Issue 2 (9-2016)                   JSDP 2016, 13(2): 11-23

Moradi A, Shahbahrami A, Ebrahimi Atani R, Alidoust Nia M. Persian XML Documents Metaheuristic Clustering Based on Structure and Content Similarity. JSDP. 2016; 13(2): 11-23
URL: http://jsdp.rcisp.ac.ir/article-1-29-en.html
University of Guilan
Abstract:

As the number of XML documents grows, organizing them effectively is essential for retrieving useful information from them. One possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional text clustering relies on similarity measures based only on document content, so the structural information contained in XML documents is ignored. This paper proposes a new model, named the matrix space model, that represents both the structural and the content features of XML documents. Based on this model, a Jaccard similarity measure is defined, and the colonial competitive algorithm is used to cluster the XML documents. Experimental results show that the proposed model is effective at identifying similar documents, that is, documents whose structure and content closely match. The method can improve clustering accuracy and make the use of XML data more productive.
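
To make the idea of a similarity measure that sees both structure and content concrete, here is a minimal Python sketch. It is not the paper's matrix space model, whose details the abstract does not give. As a simplifying assumption, each document is reduced to a set of (element path, word) pairs, and the standard Jaccard coefficient |A ∩ B| / |A ∪ B| is computed over those sets. The function names `path_term_features` and `jaccard` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def path_term_features(xml_text):
    """Return a set of (element path, lowercased word) pairs for an XML string.

    Assumption: only the text directly inside each element (node.text) is
    used; tail text and attributes are ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_text)
    features = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        for word in (node.text or "").split():
            features.add((path, word.lower()))
        for child in node:
            walk(child, path)

    walk(root, "")
    return features


def jaccard(a, b):
    """Jaccard coefficient of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical sample documents.
doc_a = "<article><title>XML clustering</title><body>Persian text</body></article>"
doc_b = "<article><title>XML retrieval</title><body>Persian text</body></article>"
doc_c = "<book><name>XML clustering</name></book>"

sim_ab = jaccard(path_term_features(doc_a), path_term_features(doc_b))
sim_ac = jaccard(path_term_features(doc_a), path_term_features(doc_c))
print(sim_ab, sim_ac)
```

In this sketch, `doc_a` and `doc_c` share the words "XML clustering" but score zero because those words sit under different element paths, whereas a content-only measure would treat them as similar. A clustering algorithm would then operate on pairwise similarities of this kind.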

Type of Study: Applicable | Subject: Paper
Received: 2013/04/27 | Accepted: 2016/06/15 | Published: 2016/09/18 | ePublished: 2016/09/18


© 2015 All Rights Reserved | Signal and Data Processing