Using RST-based deep neural networks to improve text representation

Gharavi, Erfaneh; Veisi, Hadi

doi:10.61186/jsdp.20.1.181

Volume 20, Issue 1 (6-2023) JSDP 2023, 20(1): 181-197

Gharavi E, Veisi H. Using RST-based deep neural networks to improve text representation. JSDP 2023; 20 (1) : 12
URL: http://jsdp.rcisp.ac.ir/article-1-1004-en.html

Erfaneh Gharavi, Hadi Veisi*

University of Tehran

Abstract:

Finding a highly informative, low-dimensional representation for texts, especially long texts, is one of the main challenges in natural language processing (NLP). For texts longer than a sentence or a paragraph, finding a good representation that goes beyond the bag-of-words model without losing word order remains an open problem. Such a representation should capture the semantic and syntactic information of the text while remaining useful for large-scale similarity search and accurate text classification. We propose using Rhetorical Structure Theory (RST) to incorporate text structure into the representation. RST represents a document as a tree and models the importance of, and the relationships between, its sentences or phrases. In this paper, we examine the effect of using this structure on two NLP tasks. In information retrieval, to embed document relevance in a distributed representation, we use a Siamese neural network to jointly learn document representations. Our Siamese network consists of two recursive neural network (RNN) sub-networks built over the RST tree. For this task, we use a subset of the Reuters news corpus and the BBC news dataset. The results show that our approach outperforms conventional text representations such as TF-IDF, LDA, LSA, and word vector averaging. The proposed representation outperforms the best conventional method by 6% and 3% in precision at k retrieved documents on the BBC and Reuters datasets, respectively. In the sentiment analysis task, we first use an RST-based recursive neural network to represent movie reviews and classify the polarity of people's opinions. We then propose using the nucleus-satellite information of each node in the RST tree to build an attention mechanism with a deep RNN that generates better discourse representations.
We test the effectiveness of our approach on the sentiment analysis task and show that considering the importance of each text span improves sentiment analysis performance by 3% on the Internet movie review database. In summary, this paper improves text representation with an RST-based deep neural network. The approach could also be evaluated on other languages to further test the effectiveness of using text structure.
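The ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binarized tree in which every internal node has exactly one nucleus and one satellite, and it replaces the learned composition and attention parameters with a fixed weight. The names `RSTNode`, `encode`, `cosine`, and `precision_at_k` are hypothetical, and the leaf vectors are toy values rather than trained word embeddings.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

Vector = List[float]


@dataclass
class RSTNode:
    """Node of a binarized RST-style tree (hypothetical structure for illustration)."""
    vector: Optional[Vector] = None          # set only on leaves (elementary discourse units)
    nucleus: Optional["RSTNode"] = None      # the more important child span
    satellite: Optional["RSTNode"] = None    # the supporting child span


def encode(node: RSTNode, nucleus_weight: float = 0.7) -> Vector:
    """Compose child vectors bottom-up into a single document vector.

    Assumption: a fixed convex weight stands in for learned parameters,
    so the nucleus contributes more than the satellite at every node.
    """
    if node.vector is not None:
        return node.vector
    n = encode(node.nucleus, nucleus_weight)
    s = encode(node.satellite, nucleus_weight)
    return [math.tanh(nucleus_weight * a + (1.0 - nucleus_weight) * b)
            for a, b in zip(n, s)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two document vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def precision_at_k(retrieved_labels: Sequence[str], query_label: str, k: int) -> float:
    """Fraction of the top-k retrieved documents that share the query's label."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for label in retrieved_labels[:k] if label == query_label) / k


# Siamese-style comparison: the same encoder is applied to both documents.
doc_a = RSTNode(nucleus=RSTNode(vector=[1.0, 0.0]), satellite=RSTNode(vector=[0.0, 1.0]))
doc_b = RSTNode(nucleus=RSTNode(vector=[0.0, 1.0]), satellite=RSTNode(vector=[1.0, 0.0]))
similarity = cosine(encode(doc_a), encode(doc_b))
```

In this sketch the two documents contain the same spans with the nucleus and satellite roles swapped, so their similarity falls below 1. A bag-of-words average would treat them as identical, which is the kind of structural information the abstract argues a representation should retain.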

Article number: 12

Keywords: Document embedding, Semantic representation, Rhetorical Structure Theory, Deep Neural network, Attention Mechanism


Type of Study: Research | Subject: Paper
Received: 2019/04/28 | Accepted: 2021/12/06 | Published: 2023/08/13 | ePublished: 2023/08/13



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Signal and Data Processing
