A new ensemble clustering method based on the fuzzy cmeans clusterer while maintaining diversity in the ensemble

Najafi, Fatemeh; Parvin, Hamid; Mirzaei, Kamal; Nejatiyan, Samad; Rezaie, Seyedeh Vahideh

doi:10.29252/jsdp.17.4.103

Volume 17, Issue 4 (12-1399), pages 103-122



Najafi F, Parvin H, Mirzaei K, Nejatiyan S, Rezaie S V. A new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble. JSDP 2021; 17 (4): 103-122
URL: http://jsdp.rcisp.ac.ir/article-1-976-fa.html



Fatemeh Najafi, Hamid Parvin^*, Kamal Mirzaei, Samad Nejatiyan, Seyedeh Vahideh Rezaie

Faculty of Engineering, Mamasani Branch, Islamic Azad University

Abstract:

Because clustering is an unsupervised problem, choosing one particular algorithm to cluster an unknown data set is risky and usually fails. Given the complexity of the problem and the weaknesses of basic clustering methods, most recent studies have turned to ensemble clustering. In ensemble clustering, several base clusterings are generated first, and a consensus function then aggregates them into a final clustering that has maximum similarity to the base clusterings. The resulting consensus clustering should reflect the greatest possible agreement among the base clusterings. The input to the consensus function is the full set of base clusterings, and its output is a single clustering called the consensus clustering. Ensemble clustering methods rest on the premise that a combination of several weak models is better than a single strong model. This claim holds, however, only when certain conditions are met, such as diversity among the ensemble members and adequate quality of each member. This paper presents an ensemble clustering method that uses the weak fuzzy cmeans algorithm as its base clusterer and takes several measures to increase the diversity of the ensemble. The proposed method retains the main advantage of the fuzzy cmeans algorithm, its speed, while avoiding its major weakness, the inability to discover non-spherical and non-uniform clusters. In the experimental study, the proposed method is evaluated on several data sets and compared with a range of recent, strong clustering algorithms. The experimental results indicate that the proposed method outperforms these algorithms.

Keywords: ensemble learning, ensemble clustering, fuzzy cmeans clustering algorithm, data validity


Type of study: Research | Subject: Digital data processing papers
Received: 1397/11/30 | Accepted: 1399/7/6 | Published: 1399/12/4 | Published online: 1399/12/4
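The pipeline outlined in the abstract (generate diverse base clusterings with fuzzy cmeans, then combine them with a consensus function) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify their diversity measures or their consensus function. The sketch assumes that diversity comes only from random initialization and a randomly chosen number of clusters, and it substitutes a simple co-association consensus (pairs of points grouped together in more than half of the base clusterings are linked, and connected components become the final clusters). All function names and parameters (`fuzzy_cmeans`, `co_association`, `consensus_labels`, `fcm_ensemble`, `threshold`, `c_choices`) are hypothetical.

```python
import numpy as np


def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-5, rng=None):
    """Plain fuzzy c-means. Returns (centers, U) where U has shape (n, c)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def co_association(labelings):
    """Fraction of base clusterings in which each pair of points shares a cluster."""
    L = np.asarray(labelings)  # shape (n_base, n)
    return (L[:, :, None] == L[:, None, :]).mean(axis=0)


def consensus_labels(C, threshold=0.5):
    """Assumed consensus function: connected components of the graph that
    links pairs whose co-association exceeds `threshold`."""
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((C[i] > threshold) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def fcm_ensemble(X, n_base=12, c_choices=(2, 3, 4), seed=0):
    """Build base clusterings with varied cluster counts, then combine them."""
    rng = np.random.default_rng(seed)
    labelings = []
    for _ in range(n_base):
        c = int(rng.choice(c_choices))  # vary c to encourage diversity
        _, U = fuzzy_cmeans(X, c, rng=rng)
        labelings.append(U.argmax(axis=1))  # harden fuzzy memberships
    C = co_association(labelings)
    return consensus_labels(C), C


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
    labels, C = fcm_ensemble(X)
    print("consensus clusters found:", len(np.unique(labels)))
```

Linking points through the co-association matrix, instead of relying on any single fuzzy cmeans run, is one common way an ensemble can recover cluster shapes that a single centroid-based run cannot; the paper's own consensus step may differ.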



Reuse: This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
