Unsupervised Methods for Predicting Query Performance

Karimi, Seyedeh Fatemeh; Khodabakhsh, Maryam

doi:10.61186/jsdp.22.1.3

Volume 22, Issue 1 (3-1404), pages 3-12

Karimi S, Khodabakhsh M. Unsupervised Methods for Predicting Query Performance. JSDP 2025; 22 (1) :3-12
URL: http://jsdp.rcisp.ac.ir/article-1-1407-fa.html

Karimi SF, Khodabakhsh M. Unsupervised Methods for Predicting Query Performance. Signal and Data Processing. 1404; 22 (1) :3-12

Seyedeh Fatemeh Karimi, Maryam Khodabakhsh^*

Assistant Professor, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran

Abstract:

In recent years, the use of search engines has grown steadily, and with it the need for more accurate methods of retrieving and ranking documents. Predicting search engine performance has therefore become one of the requirements and challenges of information retrieval. If the performance of a query can be estimated before or after the retrieval stage, specific actions can be taken to improve retrieval. Query performance prediction focuses on estimating how difficult it is for a particular retrieval method to satisfy a user's request. This study examines query performance using post-retrieval methods. To that end, it relies on unsupervised methods, applying clustering and measuring various metrics to assess how well queries are answered. Finally, the work is compared with the existing unsupervised methods in the literature. The results show that, relative to the best existing work, the proposed method increased the Spearman coefficient by 0.009 on the TREC DL 2019 dataset and by 0.163 on the DL-Hard dataset, and increased the Pearson coefficient by 0.037 on the TREC DL 2020 dataset.

Keywords: query performance prediction, information retrieval, search engines
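The abstract reports improvements in Spearman and Pearson coefficients, which in query performance prediction are computed between a predictor's per-query scores and the actual per-query retrieval effectiveness. The sketch below illustrates that evaluation step only. It is not the authors' method: the predictor `score_dispersion_predictor` (standard deviation of the top-k retrieval scores) is a simple stand-in for an unsupervised post-retrieval predictor, and the query IDs, retrieval scores, and effectiveness values are hypothetical.

```python
import math
from statistics import mean, pstdev


def score_dispersion_predictor(scores, k=10):
    """Stand-in post-retrieval predictor (an assumption, not the paper's
    clustering approach): population std. dev. of the top-k retrieval scores."""
    top_k = sorted(scores, reverse=True)[:k]
    return pstdev(top_k) if len(top_k) > 1 else 0.0


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: retrieval scores of the returned documents per query,
# and the actual effectiveness of each query (e.g. a ranking metric).
retrieval_scores = {
    "q1": [12.1, 9.4, 7.0, 3.2],
    "q2": [5.0, 4.9, 4.8, 4.7],
    "q3": [9.0, 8.0, 4.0, 2.0],
}
actual_effectiveness = {"q1": 0.71, "q2": 0.12, "q3": 0.45}

queries = sorted(retrieval_scores)
predicted = [score_dispersion_predictor(retrieval_scores[q]) for q in queries]
actual = [actual_effectiveness[q] for q in queries]

print("Pearson:", round(pearson(predicted, actual), 3))
print("Spearman:", round(spearman(predicted, actual), 3))
```

In this toy data the predictor orders the three queries the same way as their actual effectiveness, so the rank correlation is perfect. On a real benchmark the coefficients would be computed over all evaluated queries, and a higher value indicates a predictor that tracks true query difficulty more closely.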

Type of study: Research | Subject: Text processing papers
Received: 1402/8/22 | Accepted: 1403/12/18 | Published: 1404/3/31 | Published online: 1404/3/31

Rights and permissions
	This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
