Feature Extraction Based on Greater Class Separability Using Auxiliary Classifiers

Ghaffari, Hamidreza; Jalali Mojahed, Atena

doi:10.52547/jsdp.18.2.29

Volume 18, Issue 2 (7-1400), pages 29-44

Ghaffari H R, Jalali Mojahed A. Feature extraction based on the more resolution of the classes using auxiliary classifiers. JSDP 2021; 18 (2) :29-44
URL: http://jsdp.rcisp.ac.ir/article-1-986-fa.html

Ghaffari Hamidreza, Jalali Mojahed Atena. Feature extraction based on greater class separability using auxiliary classifiers. Signal and Data Processing (پردازش علائم و داده‌ها). 1400; 18 (2): 29-44


Hamidreza Ghaffari*, Atena Jalali Mojahed

Islamic Azad University, Ferdows

Abstract:

Classification is a machine learning technique used to predict the label of a given instance with minimum error. In this paper, the label-prediction ability of a classifier is used to create a new feature. Many feature extraction methods, such as PCA and ICA, are widely used in various fields, but they suffer from the high cost of transforming the data into another space. The proposed method aims to use the new feature to increase the separability of the classes, bringing instances within a class closer together and increasing the distinction between instances of different classes, so that classifier performance improves. First, one or more classifiers determine a proposed label for each instance of the original dataset, and this label is added to the original dataset as a new feature. The model is then built on the new dataset. The new feature is obtained separately for the training set and the test set. Experiments were conducted on twenty standard datasets, and the results of the proposed method were also compared with those of two methods described in the related work. The results show that the proposed method significantly improves classification accuracy. In the second part of the experiments, to assess the effectiveness of the proposed method, the discriminative power of the new feature was examined using two criteria: information gain and the Gini index. The results show that in most cases the feature obtained by the proposed method has higher information gain and a lower Gini index, because it has less disorder. Finally, to avoid increasing the dimensionality of the data, the extracted feature with the highest information content replaces the feature with the lowest information content. The results of this step also indicate improved performance.
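The feature-construction step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the nearest-centroid rule standing in for the auxiliary classifier, the helper names (`fit_centroids`, `predict`, `augment`), and the toy data are all assumptions made for the example. The abstract does not say which classifiers were used or exactly how the training-set feature is produced.

```python
from collections import defaultdict
from math import dist


def fit_centroids(X, y):
    """Return {label: centroid} computed from training rows X and labels y."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, X):
    """Predict, for each row, the label of the closest centroid."""
    return [min(centroids, key=lambda c: dist(row, centroids[c])) for row in X]


def augment(X, predicted):
    """Append the predicted label to each row as one extra feature."""
    return [list(row) + [label] for row, label in zip(X, predicted)]


# Toy data invented for illustration: two numeric features, labels 0 and 1.
X_train = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
y_train = [0, 0, 1, 1]
X_test = [[0.9, 1.1], [3.1, 3.0]]

# Hypothetical auxiliary classifier, fitted on the training data only.
aux = fit_centroids(X_train, y_train)

# The new feature is computed separately for the training and test sets.
X_train_new = augment(X_train, predict(aux, X_train))
X_test_new = augment(X_test, predict(aux, X_test))
```

Any final classifier can then be trained on `X_train_new` and evaluated on `X_test_new`. Fitting the auxiliary classifier on the training set only keeps test labels out of the new feature.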

Keywords: feature extraction, classification, information gain, Gini index
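The two evaluation criteria named in the abstract, and the final replacement step, can be illustrated with the standard textbook definitions. The sketch below assumes discrete feature values (a continuous feature would need to be discretized first), and the function names and toy data are hypothetical. The paper's exact procedure may differ.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_by(feature, labels):
    """Group the class labels by the value of a discrete feature."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in split_by(feature, labels))
    return entropy(labels) - remainder


def gini_index(feature, labels):
    """Weighted Gini impurity after splitting on the feature (lower is better)."""
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in split_by(feature, labels))


def replace_weakest(X, new_feature, labels):
    """Swap the column with the lowest information gain for the new feature."""
    gains = [information_gain(col, labels) for col in zip(*X)]
    weakest = gains.index(min(gains))
    return [row[:weakest] + [v] + row[weakest + 1:] for row, v in zip(X, new_feature)]


# Toy data invented for illustration.
labels = [0, 0, 1, 1]
new_feature = [0, 0, 1, 1]            # hypothetical predicted-label feature
weak_feature = ["a", "b", "a", "b"]   # hypothetical uninformative feature
X = [["a", "x"], ["b", "x"], ["a", "y"], ["b", "y"]]
X_replaced = replace_weakest(X, new_feature, labels)
```

On this toy data the predicted-label feature separates the classes perfectly, so it has the higher information gain and the lower Gini index, while `weak_feature` carries no information about the label. `replace_weakest` keeps the number of columns unchanged, which mirrors the abstract's goal of avoiding an increase in dimensionality.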

Type of study: Research | Subject: Digital data processing articles
Received: 1397/12/23 | Accepted: 1399/5/28 | Published: 1400/7/16 | Published online: 1400/7/16 (Solar Hijri dates)


Redistribution
	This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
