Volume 20, Issue 2 (2023) | Pages 113-144



Khalooei M, Homayounpour M M, Amirmazlaghani M. A survey on vulnerability of deep neural networks to adversarial examples and defense approaches to deal with them. JSDP 2023; 20(2): 8
URL: http://jsdp.rcisp.ac.ir/article-1-1205-fa.html
Khalooei M, Homayounpour M M, Amirmazlaghani M. A survey on the vulnerability of deep neural networks to adversarial examples and defense approaches to deal with them. Signal and Data Processing (JSDP). 2023; 20(2): 113-144.



Amirkabir University of Technology
Abstract: (569 views)
Today, neural networks are recognized as the most prominent tool in artificial intelligence and machine learning and are used in finance and banking, business, commerce, healthcare, medicine, insurance, robotics, aviation, automotive, military, and other domains. In recent years, numerous cases have been reported in which deep neural networks are vulnerable to attacks, most often created by adding additive or non-additive perturbations to the input data. Although these perturbations are imperceptible to a human observer, they change the output of the trained network. Measures that make deep neural networks robust against such attacks are called defenses. Some attack methods rely on tools such as the gradient of the network with respect to the input to find a perturbation, while others estimate those tools and try to recover that information even without direct access to it. On the defense side, some approaches focus on defining a suitable loss function and network architecture, others on rejecting or correcting the data before it enters the network, and still others on analyzing how robust a network is against these attacks and providing a confidence bound. This paper reviews and critiques the most recent research on the vulnerability of deep neural networks and compares the performance of these methods experimentally. In our experiments, among the l∞- and l2-bounded attacks, AutoAttack achieves very high effectiveness; however, despite its superiority over methods such as MIFGSM, PGD, and DeepFool, it needs more run time than its peers because of its composite internal structure. We also compared several widely used defense approaches against adversarial examples: among the methods based on l∞-bounded regions around the data, PGD-based adversarial training with appropriate parameters was more robust against most attack methods than the alternatives. The attack and defense methods examined in this paper have been implemented in a suitable and flexible codebase, which is available to researchers and practitioners in standard and adversarial machine learning as a stable research foundation at https://github.com/khalooei/Robustness-framework.
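To make the two recurring ingredients of the survey concrete, the sketch below shows a basic l∞-bounded PGD attack and a PGD-based adversarial-training step in PyTorch. It is an illustrative sketch only, not code from the paper or from the authors' Robustness-framework repository; the model, the perturbation budget eps = 8/255, the step size alpha = 2/255, and the 10 iterations are placeholder assumptions commonly used for CIFAR-10-scale experiments.

```python
# Illustrative sketch only (not the paper's implementation): an l_inf-bounded
# PGD attack and one PGD-based adversarial-training step in PyTorch.
# eps, alpha, and the number of steps are placeholder values.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Projected gradient ascent inside an l_inf ball of radius eps around x."""
    # Random start inside the ball, clipped to the valid image range [0, 1].
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Step along the gradient sign, then project back into the l_inf ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def adversarial_training_step(model, optimizer, x, y):
    """One Madry-style adversarial-training step: train only on PGD examples."""
    model.eval()                      # generate the attack with fixed batch-norm statistics
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Composite attacks such as AutoAttack chain several parameter-free attacks of this kind, which is consistent with the abstract's observation that they are stronger but slower than a single run of PGD or DeepFool.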
Article number: 8
Full text [PDF 1990 kb]   (235 downloads)
Study type: Applied | Article subject: Image processing articles
Received: 1399/10/30 | Accepted: 1402/4/14 | Published: 1402/7/30 | Published online: 1402/7/30 (Iranian calendar dates)




Republishing information
This article may be republished under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
