Abstract:
In recent years, the use of search engines has grown rapidly, and there is a growing need for more accurate information retrieval and ranking methods. As a result, predicting search engine performance has become one of the requirements and challenges of information retrieval. If the performance of a query can be estimated before or after the retrieval stage, specific actions can be taken to improve retrieval. Query Performance Prediction (QPP) aims to estimate how difficult it is for a specific retrieval method to satisfy a user's request. Existing research in this field falls into two main categories: (1) pre-retrieval prediction and (2) post-retrieval prediction. Pre-retrieval predictors estimate the performance of a query before the retrieval stage, so they are independent of the ranked list of results. Post-retrieval predictors, on the other hand, rely on the relationship between the query and the information available in the document collection to predict search engine performance. This study focuses on Query Performance Prediction using post-retrieval methods. We employ unsupervised methods and cluster the documents retrieved for each query. We then define five metrics, CC, DCIC, DCNIC, DCNICR, and CCR, which are calculated from the clustering results, and use them to evaluate the performance of query responses. Finally, we compare our approach with state-of-the-art unsupervised methods. The results indicate that our method improves the Spearman correlation coefficient by 0.009 and 0.163 on the TREC DL 2019 and DL-Hard datasets, respectively, and increases the Pearson correlation coefficient by 0.037 on the TREC DL 2020 dataset, compared to the best existing studies.
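The reported gains are correlation coefficients. A QPP method is typically scored by correlating its per-query predictions with the actual retrieval effectiveness measured for the same queries. The sketch below illustrates only that evaluation step. It does not implement the CC, DCIC, DCNIC, DCNICR, or CCR metrics, whose definitions are not given in this abstract. The function names, the sample numbers, and the use of nDCG@10 as the effectiveness measure are all illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical values for five queries (made up for illustration only).
    predicted = [0.62, 0.15, 0.48, 0.90, 0.33]  # predictor output per query
    actual = [0.55, 0.20, 0.61, 0.83, 0.25]     # e.g. nDCG@10 per query
    print("Pearson: ", round(pearson(predicted, actual), 3))
    print("Spearman:", round(spearman(predicted, actual), 3))
```

A higher correlation means the predictor orders (Spearman) or linearly tracks (Pearson) the true per-query difficulty more closely, which is why improvements in this abstract are reported as differences in these two coefficients.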
Type of Study: Research
Paper Received: 2023/11/13 | Accepted: 2025/03/08 | Published: 2025/06/21 | ePublished: 2025/06/21