Applying Rhetorical Structure Theory to Improve Text Representation with Deep Neural Networks

Gharavi, Erfaneh; Veisi, Hadi

Volume 20, Issue 1 (3-1402), pages 181-197

Erfaneh Gharavi, Hadi Veisi*

University of Tehran

Abstract:

Finding a rich, low-dimensional semantic representation for long texts is one of the fundamental challenges in many natural language processing tasks. Such a representation should capture the semantic and syntactic information of the text and, depending on the target task, model the relatedness and similarity of texts in a low-dimensional space. This paper addresses these challenges by combining Rhetorical Structure Theory (RST) with deep neural networks. RST provides a hierarchical structure that describes the importance of the spans in a text and the relations between them. The effect of using this tree structure is examined on two tasks: information retrieval and sentiment analysis.

In the information retrieval task, document representations were learned with a Siamese deep recursive neural network in order to model the semantic dependency between documents and thereby ease the storage and retrieval of text documents. The network consists of two deep recursive subnetworks, each built on the tree structure obtained by parsing the text according to RST. The method was evaluated on two news datasets: BBC news and a subset of the Reuters corpus. The results show that the representation produced by this structure performs better than traditional bag-of-words representations. The approach improved performance by 6% on the BBC dataset and by 3% on the Reuters dataset relative to the best classical method.

In the sentiment analysis task, a deep recursive neural network based on the RST tree was first used to build a representation and then to classify the sentiment of user reviews. Further information available in the tree was then used to improve the model, namely the importance of each part of the text as indicated by the RST tree. Identifying the central parts of the text and applying an attention mechanism to them in the deep recursive network yields a richer text representation. This representation increased the performance of the sentiment analysis model on an online dataset of movie viewers' reviews by 3% compared with the baseline methods.

The results of this study show that deep networks based on RST improve text representation. The improvement obtained by imposing structure on unstructured text can also be verified for other languages, including Persian.
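The Siamese recursive architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` and `RecursiveEncoder` classes, the toy dimension `DIM`, the random leaf vectors, the untrained random weights, and the cosine scoring are all assumptions made for illustration. A real system would obtain the tree from an RST parser, initialize leaves from word embeddings, and train the shared weights.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional

DIM = 4  # toy vector size (assumption, chosen only for illustration)


@dataclass
class Node:
    """Node of a binarized discourse tree; only leaves carry a vector."""
    vector: Optional[List[float]] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class RecursiveEncoder:
    """Composes children bottom-up: parent = tanh(W [left; right] + b)."""

    def __init__(self, dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Untrained random weights; a real model would learn W and b.
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)]
                  for _ in range(dim)]
        self.b = [0.0] * dim

    def encode(self, node: Node) -> List[float]:
        if node.vector is not None:  # leaf: return its vector unchanged
            return node.vector
        children = self.encode(node.left) + self.encode(node.right)
        return [math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
                for row, bias in zip(self.W, self.b)]


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def siamese_similarity(encoder: RecursiveEncoder, a: Node, b: Node) -> float:
    """Both documents pass through the same encoder (shared weights)."""
    return cosine(encoder.encode(a), encoder.encode(b))


# Two toy documents with different tree shapes and random leaf vectors.
rng = random.Random(1)


def random_leaf() -> Node:
    return Node(vector=[rng.uniform(-1, 1) for _ in range(DIM)])


doc_a = Node(left=Node(left=random_leaf(), right=random_leaf()),
             right=random_leaf())
doc_b = Node(left=random_leaf(),
             right=Node(left=random_leaf(), right=random_leaf()))
encoder = RecursiveEncoder(DIM)
score = siamese_similarity(encoder, doc_a, doc_b)
```

Because both documents pass through one encoder object, the two "subnetworks" share their parameters, which is the defining property of a Siamese setup. The sketch ignores the nuclearity and relation labels that an RST tree also provides.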

Article number: 12

Keywords: text representation, Rhetorical Structure Theory, deep neural networks, Siamese networks, attention mechanism

Type of study: Research | Subject: Text processing
Received: 1398/2/8 | Accepted: 1400/9/15 | Published: 1402/5/22 | Published online: 1402/5/22

Reuse information
	This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.