Volume 20, Issue 2 (6-1402), pages 81-98





Fardin S, Hashemzadeh M. Outlier Detection on Data Streams Using a QLattice-based Model and Online Learning. JSDP 2023; 20 (2) : 6
URL: http://jsdp.rcisp.ac.ir/article-1-1226-fa.html
Fardin S, Hashemzadeh M. Outlier Detection on Data Streams Using a QLattice-based Model and Online Learning. Signal and Data Processing. 1402; 20 (2): 81-98


Azarbaijan Shahid Madani University
Abstract:
Detecting outliers in data streams, which are unbounded and transient by nature, poses many challenges. To address them, this study introduces an approach based on the QLattice classification model, which operates on the basis of quantum computation and performs better in the target application than other classification methods. Because the data distribution in a stream can change over time, the proposed method also includes a scheme for online incremental learning. Because the stream is unbounded while processing memory is limited, detection is applied to a window of data that is continually updated with data sampled from previous windows. A function is also designed to handle class imbalance by means of a sampling method. Experimental results show that the proposed approach achieves better accuracy than the other methods.
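The windowing and rebalancing ideas described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which sampling methods are used, so reservoir sampling (for carrying rows over from previous windows) and random oversampling (for class imbalance) are stand-ins chosen here as assumptions. The names `SampledWindow` and `oversample_minority` are hypothetical, and the QLattice classifier itself is left out.

```python
import random
from collections import Counter


def oversample_minority(rows, rng):
    """Randomly duplicate minority-class rows until both classes have equal size.

    rows: list of (features, label) pairs, label 0 (normal) or 1 (outlier).
    Random oversampling is an assumed stand-in for the paper's sampling function.
    """
    by_label = {0: [], 1: []}
    for row in rows:
        by_label[row[1]].append(row)
    minority, majority = sorted(by_label.values(), key=len)
    if not minority:  # only one class present: nothing to duplicate
        return list(rows)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(rows) + extra


class SampledWindow:
    """Fixed-size windows plus a bounded sample of rows from earlier windows."""

    def __init__(self, window_size, memory_size, seed=0):
        self.window_size = window_size
        self.memory_size = memory_size
        self.rng = random.Random(seed)
        self.memory = []   # reservoir sample of rows from past windows
        self.seen = 0      # rows offered to the reservoir so far
        self.current = []  # rows of the window being filled

    def _remember(self, row):
        # Reservoir sampling (Algorithm R): every past row is kept
        # with equal probability while memory stays bounded.
        self.seen += 1
        if len(self.memory) < self.memory_size:
            self.memory.append(row)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.memory_size:
                self.memory[j] = row

    def push(self, row):
        """Add one row; return a balanced training set when a window fills, else None."""
        self.current.append(row)
        if len(self.current) < self.window_size:
            return None
        training = self.memory + self.current  # sampled history + new window
        for r in self.current:
            self._remember(r)
        self.current = []
        return oversample_minority(training, self.rng)


if __name__ == "__main__":
    stream_rng = random.Random(1)
    window = SampledWindow(window_size=50, memory_size=20)
    for t in range(200):
        label = 1 if stream_rng.random() < 0.1 else 0
        batch = window.push(((stream_rng.gauss(5 * label, 1),), label))
        if batch is not None:
            # An incremental learner would be updated on `batch` here.
            print(t, len(batch), Counter(lbl for _, lbl in batch))
```

Memory use stays bounded by `window_size + memory_size` rows regardless of how long the stream runs, which is the property the abstract asks of the window mechanism.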
Article number: 6
Full text [PDF 1218 kb]
Type of study: Research | Subject: Digital data processing papers
Received: 1400/1/29 | Accepted: 1401/2/21 | Published: 1402/7/30 | ePublished: 1402/7/30




Redistribution
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.

All rights to this website belong to the scientific-research quarterly Signal and Data Processing.