Volume 20, Issue 4 (12-1402), Pages 23-34





Hamidzadeh J, Rashidi Mahmoodi M A, Moradi M. Online Learning for Imbalanced Data Streams with Concept Drift by Belief Theory and Chaotic Function. JSDP 2024; 20 (4) : 2
URL: http://jsdp.rcisp.ac.ir/article-1-1246-fa.html



Sadjad University
Abstract:
The characteristics of data streams are unstable over time, and class distributions change; learning models therefore often need to adapt to concept drift. This paper presents a classifier for imbalanced data streams with concept drift, aimed at two challenges: imbalance among the observed classes and the occurrence of concept drift. The proposed method uses clustering to remove borderline and noisy stream samples. The data are weighted with a belief function and, taking the data labels into account, oversampling is performed in low-density regions of the minority class using a chaotic approach. Then, by defining threshold bounds, gradual and incremental concept drift is detected. Labels are predicted by an ensemble classifier with maximum weighted voting. The performance of the proposed method is evaluated on datasets from the UCI repository using leave-one-out (LOO) and compared with state-of-the-art classifiers. The experimental results indicate that the proposed method outperforms them on the evaluation metrics.
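As a rough illustration of two steps named in the abstract, the sketch below generates synthetic minority samples with a chaotic sequence and combines ensemble outputs by weighted voting. It is not the authors' algorithm: the abstract does not say which chaotic function is used, so the logistic map with r = 4 is an assumption, and the function names (`logistic_map`, `chaotic_oversample`, `weighted_vote`) are hypothetical. The clustering, belief-function weighting, density estimation, and drift-detection steps are omitted.

```python
from collections import defaultdict


def logistic_map(x0, n, r=4.0):
    """Return n successive values of the logistic map x <- r * x * (1 - x).

    Assumption: the logistic map stands in for the paper's (unspecified)
    chaotic function. For r = 4 and x0 in [0, 1] the values stay in [0, 1].
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_oversample(minority, n_new, x0=0.37):
    """Create n_new synthetic points between consecutive minority samples.

    The interpolation factor comes from the logistic map rather than from a
    pseudo-random generator. Pairs are taken in list order; a fuller version
    would pick neighbours in low-density regions instead.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    factors = logistic_map(x0, n_new)
    synthetic = []
    for i, t in enumerate(factors):
        a = minority[i % len(minority)]
        b = minority[(i + 1) % len(minority)]
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


def weighted_vote(predictions, weights):
    """Return the label whose ensemble members carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

For example, `chaotic_oversample([[0.0, 0.0], [1.0, 2.0], [2.0, 1.0]], 5)` returns five points, each lying on a segment between two of the given minority samples, and `weighted_vote(["a", "b", "b"], [0.9, 0.3, 0.4])` picks `"a"` because its single vote outweighs the two votes for `"b"`.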
 
Article number: 2
Full text [PDF 807 kb]
Type of study: Research | Subject: Digital data processing papers
Received: 1400/4/10 | Accepted: 1402/4/14 | Published: 1403/2/6 | ePublished: 1403/2/6 (Solar Hijri dates)



Rights and permissions
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.

All rights to this website belong to the scientific research quarterly Signal and Data Processing (JSDP).