Ensemble Clustering with Diversity Maximization Using Evolutionary Optimization Algorithms

Abbasi, Sadrollah; Nejatian, Samad; Parvin, Hamid; Rezaei, Vahideh; Bagheri Fard, Karamollah

doi:10.61186/jsdp.19.4.95

Volume 19, Issue 4 (12-1401), pages 95-120



Abbasi S, Nejatian S, Parvin H, Rezaei V, Bagheri Fard K. The ensemble clustering with maximize diversity using evolutionary optimization algorithms. JSDP 2023; 19 (4) : 8
URL: http://jsdp.rcisp.ac.ir/article-1-1144-fa.html

Abbasi S, Nejatian S, Parvin H, Rezaei V, Bagheri Fard K. Ensemble clustering with diversity maximization using evolutionary optimization algorithms. Signal and Data Processing. 1401; 19 (4): 95-120



Sadrollah Abbasi, Samad Nejatian*, Hamid Parvin, Vahideh Rezaei, Karamollah Bagheri Fard

Faculty of Electrical Engineering, Yasuj Branch, Islamic Azad University, Kohgiluyeh and Boyer-Ahmad, Iran

Abstract:

Data clustering is one of the main steps in data mining; its task is to discover hidden patterns in unlabeled data. Because of the complexity of the problem and the weakness of base clustering methods, most recent studies have turned to ensemble clustering. Diversity among the primary results is one of the most important factors affecting the quality of the final result. The quality of the primary results is a second factor that affects the quality of the consensus. Both factors have received attention in recent ensemble clustering research. This paper proposes a new framework for improving the performance of ensemble clustering, based on using a subset of the primary clusters. The proposed method shows that using a subset of the primary clustering results can be better than using all of them. It also proposes a measure for evaluating the primary results relative to one another, which makes it possible to determine which subset of the primary results can improve the performance of ensemble clustering. Because intelligent evolutionary algorithms have been able to solve most complex engineering problems, they are used here to select a subset of the primary clusters. The selection is performed with three intelligent methods: the genetic algorithm, simulated annealing, and particle swarm optimization. The main idea of the proposed methods for selecting a subset of clusters is to use stable clusters, found with intelligent search (evolutionary) algorithms. A stability measure based on mutual information is used to evaluate the clusters. Finally, the selected clusters are aggregated with several final consensus methods. Experimental results on several standard data sets, evaluated with normalized mutual information, the Fisher measure, and accuracy, and compared with the methods of Alizadeh, Azimi, Berikov, CLWGC, RCESCC, KME, CFSFDP, DBSCAN, NSC, and Chen, show that the proposed methods can effectively improve on the full ensemble.
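The selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the objective function, the encoding, or any parameter values, so everything below is an assumption made for illustration: the sketch selects whole base partitions rather than individual clusters, scores a subset by one minus its mean pairwise normalized mutual information (a plain diversity score, not the paper's stability measure), and searches with simulated annealing, one of the three optimizers the abstract names. The function names `nmi`, `diversity`, and `anneal_select` and the settings `steps`, `t0`, and `cooling` are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def nmi(a, b):
    """Normalized mutual information of two label lists (geometric-mean normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(nij / n * math.log(nij * n / (ca[i] * cb[j]))
             for (i, j), nij in cab.items())
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0.0 or hb == 0.0:
        # A single-cluster partition carries no information.
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)


def diversity(partitions, subset):
    """Assumed objective: 1 - mean pairwise NMI over the chosen partitions."""
    pairs = list(combinations(subset, 2))
    return 1.0 - sum(nmi(partitions[i], partitions[j]) for i, j in pairs) / len(pairs)


def anneal_select(partitions, k, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Pick k of the partitions (2 <= k < len(partitions)) by simulated annealing."""
    rng = random.Random(seed)
    m = len(partitions)
    current = rng.sample(range(m), k)
    cur_score = diversity(partitions, current)
    best, best_score = list(current), cur_score
    t = t0
    for _ in range(steps):
        # Neighbor move: swap one selected partition for one unselected partition.
        outside = [i for i in range(m) if i not in current]
        cand = list(current)
        cand[rng.randrange(k)] = rng.choice(outside)
        score = diversity(partitions, cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if score >= cur_score or rng.random() < math.exp((score - cur_score) / t):
            current, cur_score = cand, score
            if cur_score > best_score:
                best, best_score = list(current), cur_score
        t *= cooling
    return sorted(best), best_score


# Five toy base partitions of six objects (labels are arbitrary cluster ids).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],  # exact duplicate of the first
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, relabeled
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 2, 2],
]
selected, score = anneal_select(partitions, k=3)
print("selected partitions:", selected, "diversity:", round(score, 3))
```

Swapping the objective for a stability or quality-plus-diversity score, or replacing the annealing loop with a genetic algorithm or particle swarm search, would leave the rest of the sketch unchanged, since the optimizer only needs a function that scores a candidate subset.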

Keywords: local optimization, diversity, evolutionary algorithms, correlation matrix.

Article number: 8



Type of study: Research | Subject: Papers on digital data processing
Received: 1399/2/28 | Accepted: 1401/2/21 | Published: 1401/12/29 | Published online: 1401/12/29

References

[1] Azimi J., The investigation of the Ensemble Clustering Diversity. MSc Thesis. Iran University of Science and Technology, 2006.

[2] Alizadeh H., Minaei-Bidgoli B., Parvin H. Cluster ensemble selection based on a new cluster stability measure. Intell. Data Anal. 18(3): 389-408, 2014. [DOI:10.3233/IDA-140647]

[3] Jain A., Murty M. N., and Flynn P. (1999), Data clustering: A review. ACM Computing Surveys, 31(3):264-323. [DOI:10.1145/331499.331504]

[4] Faceli K., de Souto M.C.P., Multi-objective Clustering Ensemble, Proceedings of the Sixth International Conference on Hybrid Intelligent Systems (HIS'06), 2006. [DOI:10.1109/HIS.2006.264934]

[5] Strehl A. and Ghosh J., "Cluster ensembles - a knowledge reuse framework for combining multiple partitions". Journal of Machine Learning Research, 3(Dec):583-617, 2002.

[6] Fred, A. and Jain, A.K. "Data Clustering Using Evidence Accumulation", Proc. of the 16th Intl. Conf. on Pattern Recognition, ICPR02, Quebec City, pp. 276-280, 2002.

[7] Topchy, A., Jain, A.K. and Punch, W.F., "Combining Multiple Weak Clusterings", Proc. 3rd IEEE Intl. Conf. on Data Mining, pp. 331-338, 2003.

[8] Fred A. and Lourenco A. (2008), "Cluster Ensemble Methods: from Single Clusterings to Combined Solutions", Studies in Computational Intelligence (SCI), 126, 3-30. [DOI:10.1007/978-3-540-78981-9_1]

[9] Ayad H.G. and Kamel M.S., Cumulative Voting Consensus Method for Partitions with a Variable Number of Clusters, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 30, no. 1, 160-173, 2008. [DOI:10.1109/TPAMI.2007.1138]

[10] Minaei-Bidgoli B., Topchy A. and Punch W.F., "Ensembles of Partitions via Data Resampling", in Proc. Intl. Conf. on Information Technology, ITCC 04, Las Vegas, 2004. [DOI:10.1109/ITCC.2004.1286629]

[11] Parvin H., Minaei-Bidgoli B. "A clustering ensemble framework based on selection of fuzzy weighted clusters in a locally adaptive clustering algorithm". Pattern Anal. Appl. 18(1): 87-112, 2015. [DOI:10.1007/s10044-013-0364-4]

[12] Alizadeh H., Minaei-Bidgoli B., Parvin H. Optimizing Fuzzy Cluster Ensemble in String Representation. IJPRAI 27(2), 2013. [DOI:10.1142/S0218001413500055]

[13] Parvin H., Minaei-Bidgoli B., Alinejad-Rokny H., Punch W.F. "Data weighing mechanisms for clustering ensembles". Computers & Electrical Engineering 39(5): 1433-1450, 2013. [DOI:10.1016/j.compeleceng.2013.02.004]

[14] Barthelemy J.P. and Leclerc B., The median procedure for partition, In Partitioning Data Sets, AMS DIMACS Series in Discrete Mathematics, Cox, I. J. et al eds., 19, pp. 3-34, 1995. [DOI:10.1090/dimacs/019/01]

[15] Fern X.Z., and Lin W., "Cluster Ensemble Selection". Statistical Analysis and Data Mining 1(3): 128-141, 2008. [DOI:10.1002/sam.10008]

[16] Parvin H., Mirnabibaboli M., Alinejad-Rokny H. "Proposing a classifier ensemble framework based on classifier selection and decision tree". Eng. Appl. of AI 37: 34-42, 2015. [DOI:10.1016/j.engappai.2014.08.005]

[17] Dudoit S. and Fridlyand, J., Bagging to improve the accuracy of a clustering procedure, Bioinformatics, 19 (9), pp. 1090-1099, 2003. [DOI:10.1093/bioinformatics/btg038]

[18] Fischer B. and Buhmann J.M., "Bagging for path-based clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1411-1415, 2003. [DOI:10.1109/TPAMI.2003.1240115]

[19] Fred A. and Jain A.K., "Robust data clustering", in: Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, USA, vol. II, pp. 128-136, 2003.

[20] Fred A.L. and Jain A.K. "Combining Multiple Clusterings Using Evidence Accumulation". IEEE Trans. on Pattern Analysis and Machine Intelligence, 27(6):835-850, 2005. [DOI:10.1109/TPAMI.2005.113]

[21] Kuncheva L.I. and Hadjitodorov S. "Using diversity in cluster ensembles". In Proc. of IEEE Intl. Conference on Systems, Man and Cybernetics, pages 1214-1219, 2004.

[22] Kuncheva L.I. and Whitaker C. J., "Measures of diversity in classifier ensembles", Machine Learning, 2003.

[23] Baumgartner R., Somorjai R., Summers R., Richter W., Ryner L., and Jarmasz M., Resampling as a Cluster Validation Technique in fMRI, Journal of Magnetic Resonance Imaging 11: pp. 228-231, 2000. [DOI:10.1002/(SICI)1522-2586(200002)11:2<228::AID-JMRI23>3.0.CO;2-Z]

[24] Breckenridge J., Replicating cluster analysis: Method, consistency and validity, Multivariate Behavioral Research, 1989. [DOI:10.1207/s15327906mbr2402_1]

[25] Shamir O., Tishby N., "Cluster Stability for Finite Samples", 21st Annual Conference on Neural Information Processing Systems (NIPS07), 2007.

[26] Roth V., Braun M.L., Lange T., and Buhmann J.M., "Stability-Based Model Order Selection in Clustering with Applications to Gene Expression Data", ICANN 2002, LNCS 2415, pp. 607-612, 2002a. [DOI:10.1007/3-540-46084-5_99]

[27] Roth V., Lange T., Braun M., and Buhmann J., "A Resampling Approach to Cluster Validation", Intl. Conf. on Computational Statistics, COMPSTAT, 2002b. [DOI:10.1007/978-3-642-57489-4_13]

[28] Saha A., Das S. "Categorical fuzzy k-modes clustering with automated feature weight learning". Neurocomputing 166: 422-435, 2015. [DOI:10.1016/j.neucom.2015.03.037]

[29] Law M.H.C., Topchy A.P., and Jain A.K. "Multiobjective data clustering". In Proc. of IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 424-430, Washington D.C, 2004.

[30] Akbari E., Dahlan H.M., Ibrahim R., Alizadeh H. Hierarchical cluster ensemble selection. Eng. Appl. of AI 39: 146-156, 2015. [DOI:10.1016/j.engappai.2014.12.005]

[31] Iam-On, N. and T. Boongoen, "Diversity-driven generation of link-based cluster ensemble and application to data classification", Expert Systems with Applications, 42(21): pp. 8259-8273, 2015. [DOI:10.1016/j.eswa.2015.06.051]

[32] Mitchell M., "An Introduction to Genetic Algorithms", A Bradford Book, The MIT Press, Cambridge, Massachusetts, London, England, Fifth printing, 1999.

[33] Aarts E. H. L. and Korst J. Simulated Annealing and Boltzmann Machines, John Wiley & Sons, Essex, U.K, 1989.

[34] Kennedy J. and Eberhart R.C., "Particle Swarm Optimization", Proceedings of IEEE International Conference on Neural Networks, Piscataway, NJ, pp. 1942-1948, 1995.

[35] Fred A. and Jain A.K., "Learning Pairwise Similarity for Data Clustering", In Proc. of the 18th Int. Conf. on Pattern Recognition (ICPR'06), 2006. [DOI:10.1109/ICPR.2006.754]

[36] Fridlyand J. and Dudoit S. "Applications of resampling methods to estimate the number of clusters and to improve the accuracy of a clustering method". Stat. Berkeley Tech Report. No. 600, 2001.

[37] X. Fern, C. Brodley, "Solving cluster ensemble problems by bipartite graph partitioning", Proc. of the 21st International Conference on Machine Learning, 2004. [DOI:10.1145/1015330.1015414]

[38] D. Huang, J. Lai, C. D. Wang, "Ensemble clustering using factor graph", Pattern Recognition, vol. 50, pp. 131-142, 2016. [DOI:10.1016/j.patcog.2015.08.015]

[39] S. Mimaroglu, E. Erdil, "Combining multiple clusterings using similarity graph", Pattern Recognition, vol. 44, no. 3, pp. 694-703, 2011. [DOI:10.1016/j.patcog.2010.09.008]

[40] C. Boulis, M. Ostendorf, "Combining multiple clustering systems", Proc. European Conf. Principles and Practice of Knowledge Discovery in Databases, 2004. [DOI:10.1007/978-3-540-30116-5_9]

[41] A. Topchy, B. Minaei-Bidgoli, A. Jain, "Adaptive clustering ensembles", Proc. the 17th International Conference on Pattern Recognition, 2004. [DOI:10.1109/ICPR.2004.1334105]

[42] P. Hore, L. O. Hall, D. B. Goldgof, "A scalable framework for cluster ensembles", Pattern Recognition, vol. 42, no. 5, pp. 676-688, 2009. [DOI:10.1016/j.patcog.2008.09.027]

[43] B. Long, Z. Zhang, P. S. Yu, "Combining multiple clusterings by soft correspondence", Proc. the 4th IEEE International Conference on Data Mining, 2005.

[44] D. Cristofor, D. Simovici, "Finding median partitions using information theoretical based genetic algorithms", J. Universal Computer Science, vol. 8, no. 2, pp. 153-172, 2002.

[45] A. Topchy, A. Jain, W. Punch, "Clustering ensembles: Models of consensus and weak partitions", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 12, pp. 1866-1881, 2005. [DOI:10.1109/TPAMI.2005.237]

[46] H. Wang, H. Shan, A. Banerjee, "Bayesian cluster ensembles", Statistical Analysis and Data Mining, vol. 4, no. 1, pp. 54-70, 2011. [DOI:10.1002/sam.10098]

[47] Z. He, X. Xu, S. Deng, "A cluster ensemble method for clustering categorical data", Information Fusion, vol. 6, no. 2, pp. 143-151, 2005. [DOI:10.1016/j.inffus.2004.03.001]

[48] N. Nguyen, R. Caruana, "Consensus Clusterings", Proc. IEEE Intl Conf. Data Mining, pp. 607-612, 2007. [DOI:10.1109/ICDM.2007.73]

[49] Z. Huang, "Extensions to the k-means algorithm for clustering large data sets with categorical values", Data Mining and Knowledge Discovery, vol. 2, no. 3, pp. 283-304, 1998. [DOI:10.1023/A:1009769707641]

[50] S. Abbasi, S. Nejatian, H. Parvin, V. Rezaie, K. Bagherifard, "Clustering ensemble selection considering quality and diversity", Artificial Intelligence Review, vol. 52, pp. 1311-1340, 2018. [DOI:10.1007/s10462-018-9642-2]

[51] A. Bagherinia, B. Minaei-Bidgoli, M. Hossinzadeh, H. Parvin, "Elite fuzzy clustering ensemble based on clustering diversity and quality measures", Applied Intelligence, vol. 49, pp. 1724-1747, 2019. [DOI:10.1007/s10489-018-1332-x]

[52] A. Nazari, A. Dehghan, S. Nejatian, V. Rezaie, H. Parvin, "A comprehensive study of clustering ensemble weighting based on cluster quality and diversity", Pattern Analysis and Applications, vol. 22, pp. 133-145, 2019. [DOI:10.1007/s10044-017-0676-x]

[53] M. Mojarad, S. Nejatian, H. Parvin, M. Mohammadpoor, "A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters", Applied Intelligence, vol. 49, pp. 2567-2581, 2019. [DOI:10.1007/s10489-018-01397-x]

[54] Z. Chen, A. Bagherinia, B. Minaei-Bidgoli, H. Parvin, K. H. Pho, "Fuzzy Clustering Ensemble Considering Cluster Dependability", International Journal on Artificial Intelligence Tools, 30(02): 2150007, 2021. [DOI:10.1142/S021821302150007X]

[55] V. Berikov, "A probabilistic model of fuzzy clustering ensemble", Pattern Recognition and Image Analysis, 28(1): 1-10, 2018. [DOI:10.1134/S1054661818010029]

[56] Moradi M, Nejatian S, Parvin H, Bagherifard K, Rezaei V. Clustering and Memory-based Parent-Child Swarm Meta-heuristic Algorithm for Dynamic Optimization. JSDP 2021; 18 (3): 127-146 [DOI:10.52547/jsdp.18.3.127]

[57] Omidvar M, Nejatian S, Parvin H, Bagherifard K, Rezaie V. Providing an algorithm for solving general optimization problems based on Domino theory. JSDP 2022; 19 (2): 87-106 [DOI:10.52547/jsdp.19.2.87]


Reuse: This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.

All rights to this website belong to the scientific research quarterly Signal and Data Processing.
