Volume 20, Issue 1 (6-2023) | JSDP 2023, 20(1): 181-197




Gharavi E, Veisi H. Using RST-based deep neural networks to improve text representation. JSDP 2023; 20 (1) : 12
URL: http://jsdp.rcisp.ac.ir/article-1-1004-en.html
University of Tehran
Abstract:
Finding a highly informative, low-dimensional representation for texts, especially long texts, is one of the main challenges in natural language processing (NLP). For texts longer than a sentence or a paragraph, finding a good representation that goes beyond the bag-of-words model and preserves word order remains difficult. Such a representation should capture the semantic and syntactic information of the text while remaining suitable for large-scale similarity search and accurate text classification. We propose using Rhetorical Structure Theory (RST) to incorporate text structure into the representation. RST represents a document as a tree and models the importance of, and relationships between, sentences or phrases. In this paper, we examine the effect of using this structure on two different NLP tasks. In information retrieval, to embed document relevance in a distributed representation, we use a Siamese neural network to jointly learn document representations. Our Siamese network consists of two sub-networks, each a recursive neural network (RNN) built over the RST tree. For this task, we use a subset of the Reuters news corpus and the BBC news dataset. The results show that our approach outperforms conventional text representations such as tf-idf, LDA, LSA, and word vector averaging. The proposed representation beats the best conventional method by 6% and 3% in precision at k retrieved documents on the BBC and Reuters datasets, respectively. In the sentiment analysis task, we first use an RST-based recursive neural network to represent movie reviews and classify the polarity of people’s opinions. We then propose using the nucleus-satellite information of a node in the RST tree to build an attention mechanism with a deep RNN to generate better discourse representations.
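The retrieval metric mentioned above, precision at k, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the use of cosine similarity for ranking, and the toy vectors are all assumptions made for illustration, and "relevant" is taken here to mean any document in a supplied set of relevant IDs.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_cosine(query_vec, doc_vecs):
    """Return document IDs sorted by decreasing similarity to the query.

    doc_vecs is a hypothetical mapping from document ID to embedding.
    """
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


# Toy example with made-up 2-dimensional embeddings.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_by_cosine([1.0, 0.1], docs)
score = precision_at_k(ranking, {"a", "b"}, k=2)
```

In this toy setup only one of the top two retrieved documents is in the relevant set, so the score is 0.5. Any document embedding, including one produced by an RST-based network, could be plugged in as the values of `docs`.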
We test the effectiveness of our approach on the sentiment analysis task and show that considering the importance of each text span improves sentiment analysis performance by 3% on the internet movie review database. In summary, this paper improves text representation with an RST-based deep neural network. The approach can also be evaluated on other languages to show the effectiveness of using the structural format of the text.
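The idea of weighting text spans by their importance in the RST tree can be sketched as a bottom-up composition over the tree. This is only a simplified illustration, not the model in the paper: a fixed weighted average stands in for the learned recursive composition and attention mechanism, and the `RSTNode` class, the `encode` function, and the weight values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RSTNode:
    """A node in a (hypothetical) RST tree.

    Leaves carry a vector for an elementary discourse unit;
    internal nodes carry children instead.
    """
    role: str  # "nucleus" or "satellite"
    vector: Optional[List[float]] = None
    children: Optional[List["RSTNode"]] = None


def encode(node, nucleus_weight=2.0, satellite_weight=1.0):
    """Compose a vector for the subtree rooted at `node`, bottom-up.

    Assumption: nuclei get a larger fixed weight than satellites.
    In a trained model these weights would be learned, not fixed.
    """
    if not node.children:
        return list(node.vector)
    child_vecs = [encode(c, nucleus_weight, satellite_weight) for c in node.children]
    weights = [
        nucleus_weight if c.role == "nucleus" else satellite_weight
        for c in node.children
    ]
    total = sum(weights)
    dim = len(child_vecs[0])
    # Weighted average of the child vectors, one dimension at a time.
    return [
        sum(w * vec[i] for w, vec in zip(weights, child_vecs)) / total
        for i in range(dim)
    ]


# Toy tree: one nucleus span and one satellite span under a root.
tree = RSTNode(
    role="nucleus",
    children=[
        RSTNode(role="nucleus", vector=[1.0, 0.0]),
        RSTNode(role="satellite", vector=[0.0, 1.0]),
    ],
)
doc_vector = encode(tree)
```

With the default weights, the nucleus contributes twice as much as the satellite to `doc_vector`. The resulting document vector could then be passed to any classifier for polarity prediction.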
 
Article number: 12
Type of Study: Research | Subject: Paper
Received: 2019/04/28 | Accepted: 2021/12/06 | Published: 2023/08/13 | ePublished: 2023/08/13



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing