Mojarad M, Parvin H, Nejatiyan S, Bagheri Fard K A. Combining an Ensemble Clustering Method and a New Similarity Criterion for Modeling the Hereditary Behavior of Diseases. JSDP. 2021; 18 (2) :97-114
Background: There are many theories about the causes of hereditary diseases, but physicians believe that genetic and environmental factors together play an important role in the development and progression of these diseases, although the extent of this effect is not yet clear. To detect the genes that contribute to the development of diseases, it is necessary to establish the relationships between cells/tissues.
Objective: Inter-cell and inter-tissue communications indicate the hereditary relationships between patients. Detecting these communications helps to identify common parts of the body that are affected by various diseases. The interaction between different cells/tissues can be demonstrated through the gene expression they share. Sampling chromosomes yields useful information about the type of disease and how it is transmitted. By examining this information, one can identify disorders that have caused pronounced changes. In previous research, various clustering methods have been used to discover the links between diseases based on gene expression data. However, ensemble clustering approaches have not yet been used for this purpose.
Method: In this paper, intercellular and inter-tissue interactions in various diseases are identified using the characteristics of the topological structure of a graph and an improved ensemble clustering method. The proposed clustering algorithm uses an agreement-based (consensus) similarity function to measure the similarity between objects. The proposed method has two stages. In the first stage, several clustering models are combined to identify the initial relationships between cells or tissues, with the aim of producing better results than the individual algorithms. In the second stage, the similarity between cells or tissues in each cluster is calculated using a similarity criterion based on the topological structure of the graph. Finally, the maximum similarity between cells or tissues in each cluster is used to discover the relationships between diseases. In addition, an algorithm that reduces the uncertainty of objects by reallocating them to other clusters is evaluated as a way to enhance the quality of the final clusters.
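The first stage, combining several clusterings into one, can be illustrated with a co-association (evidence accumulation) sketch. This is a generic ensemble technique chosen for illustration, not the authors' algorithm: the function names `co_association` and `consensus_clusters`, the 0.5 threshold, and the toy partitions are all assumptions, and the abstract does not specify how the base clusterings are actually merged.

```python
def co_association(partitions):
    """Fraction of base partitions in which each pair of objects shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_clusters(co, threshold=0.5):
    """Link objects whose co-association exceeds the (assumed) threshold and
    return the connected components as consensus cluster labels."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of six objects (cells or tissues).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
co = co_association(partitions)
consensus = consensus_clusters(co)
```

Here the second base clustering disagrees with the other two about objects 2 and 3, but the majority vote encoded in the co-association matrix still groups the first three and the last three objects together. A graph-based similarity criterion, as described for the second stage, would then be applied within each consensus cluster.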
Results: To evaluate the performance of the proposed method, several UCI datasets and the FANTOM5 dataset have been used. On the FANTOM5 dataset, the proposed method achieves a silhouette score of 0.901 with 18 clusters for cells and 0.762 with 13 clusters for tissues.
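For readers unfamiliar with the silhouette measure reported above, the following is a minimal sketch of the standard definition: for each object, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the silhouette is (b - a) / max(a, b), averaged over all objects. The Euclidean distance, the function name `silhouette_score`, and the toy points are assumptions for illustration; the abstract does not state which distance was used on FANTOM5.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance (an assumption)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data: two tight, well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, while values below 0 indicate that objects sit closer to another cluster than to their own, as in the deliberately poor labeling `bad`.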
Conclusion: The evaluations confirm the strength of the proposed clustering algorithm in terms of accuracy. Clustering the cells or tissues increased the accuracy and concentration of the graph-based topological similarity criterion within the similarity range of the cells or tissues.
Type of Study: Applicable | Subject: Paper
Received: 2019/02/26 | Accepted: 2020/08/18 | Published: 2021/10/8 | ePublished: 2021/10/8

