Volume 15, Issue 4 (12-1397), pages 17-30




Shahid Rajaee Teacher Training University
Abstract:
A cluster ensemble combines the results of existing clusterings. Research over the past decade shows that when, instead of combining all clusterings, only a subset of them is selected on the basis of quality and diversity, the resulting consensus clustering is considerably more accurate. This paper presents a new method for selecting clusterings based on these two criteria, quality and diversity. To this end, a set of different clusterings is first generated with the k-means algorithm, where k is a random number in each run. The clusterings produced this way are then grouped by a new algorithm that operates on the similarity between clusterings, so that clusterings that resemble one another fall into the same group. Next, from each group, the highest-quality member is selected by a voting-based method to build the consensus clustering. Three consensus functions, HGPA, CSPA, and MCLA, are used to combine the clusterings. Finally, the new method is evaluated on real datasets from the UCI repository. The results show that the new method is more efficient and more accurate than previous methods.
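The selection steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not name the similarity measure, the grouping algorithm, or the voting rule, so the sketch substitutes the plain Rand index, a greedy threshold grouping, and mean similarity to the rest of the ensemble as a quality proxy. All function names (`rand_index`, `group_clusterings`, `select_representatives`, `co_association`) and the threshold value are hypothetical. The final step builds only the co-association matrix, which is the similarity matrix that CSPA goes on to partition; the graph partitioning itself is omitted.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def group_clusterings(labelings, threshold):
    """Assumed stand-in for the grouping step: a labeling joins the first group
    whose first member is at least `threshold`-similar, else it starts a new group."""
    groups = []
    for idx, lab in enumerate(labelings):
        for group in groups:
            if rand_index(labelings[group[0]], lab) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


def select_representatives(labelings, groups):
    """Assumed stand-in for the voting step: from each group keep the member
    with the highest mean similarity to every other labeling in the ensemble."""
    def quality(i):
        others = [j for j in range(len(labelings)) if j != i]
        return sum(rand_index(labelings[i], labelings[j]) for j in others) / len(others)
    return [max(group, key=quality) for group in groups]


def co_association(labelings):
    """Entry [i][j] is the fraction of labelings that place objects i and j together."""
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
             for j in range(n)] for i in range(n)]


# Toy ensemble over six objects. In practice each row would come from one
# k-means run with a randomly drawn k; the rows are hard-coded here.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels permuted
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],  # duplicate of the third
    [0, 1, 0, 1, 0, 1],  # a partition unlike the others
]

groups = group_clusterings(ensemble, threshold=0.9)
selected = select_representatives(ensemble, groups)
matrix = co_association([ensemble[i] for i in selected])
```

On this toy ensemble the two pairs of identical partitions collapse into one group each, so one member per group is kept and the redundant copies no longer weigh on the co-association matrix. The Rand index is used here only because it ignores label permutations; an adjusted or information-based measure could be substituted without changing the structure of the sketch.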
Type of study: Research | Subject: Papers on digital data processing
Received: 1395/10/11 | Accepted: 1397/10/19 | Published: 1397/12/17 | ePublished: 1397/12/17

Reuse information
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.