Volume 15, Issue 1 (1397/3, 2018), pages 115-126


Salami S, Shamsfard M. Phrase-Boundary Translation Model Using Shallow Syntactic Labels. JSDP 2018; 15(1): 115-126
URL: http://jsdp.rcisp.ac.ir/article-1-540-fa.html
Shahid Beheshti University
Abstract:

The phrase-boundary model for statistical machine translation labels rules with the word classes of the boundary words of target-side phrases. In this paper, we extend the phrase-boundary model with shallow syntactic labels, namely POS tags and chunk labels. Giving priority to chunk labels, the proposed model names nonterminals with the shallow syntactic labels found at the boundaries of target phrases. Unlike the SAMT model, which labels rules using the syntactic parse trees of target sentences, the proposed model does not require deep syntactic parsing. Moreover, the greater the word-order difference between the source and target languages, the fewer aligned phrases can be matched to constituents of the syntactic parse tree. We ran a number of experiments on translation from Persian and German into English, two language pairs with large word-order differences. In these experiments, the proposed phrase-boundary model achieved translation quality about 0.5 BLEU points better than the SAMT model.
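The labeling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `word_label` and `span_label`, the BIO chunk encoding, the `+` separator, and the rule "use the chunk type when the boundary word is inside a chunk, otherwise fall back to its POS tag" are all assumptions made here to show one plausible reading of "priority to chunk labels".

```python
from typing import List


def word_label(pos_tag: str, chunk_tag: str) -> str:
    """Return a shallow syntactic label for one target word.

    Assumption: chunk tags use BIO encoding ("B-NP", "I-NP", "O").
    The chunk type takes priority; POS is the fallback for "O" words.
    """
    if chunk_tag != "O":
        return chunk_tag.split("-", 1)[1]  # "B-NP" -> "NP"
    return pos_tag


def span_label(pos_tags: List[str], chunk_tags: List[str],
               start: int, end: int) -> str:
    """Name a nonterminal by the labels of the two boundary words
    of the target phrase covering positions start..end (inclusive)."""
    left = word_label(pos_tags[start], chunk_tags[start])
    right = word_label(pos_tags[end], chunk_tags[end])
    return f"{left}+{right}"  # hypothetical naming scheme


# Hypothetical tagged target sentence: "the old man saw him ."
pos = ["DT", "JJ", "NN", "VBD", "PRP", "."]
chunks = ["B-NP", "I-NP", "I-NP", "B-VP", "B-NP", "O"]

print(span_label(pos, chunks, 0, 2))  # phrase "the old man"
print(span_label(pos, chunks, 2, 3))  # phrase "man saw"
print(span_label(pos, chunks, 3, 5))  # phrase "saw him ."
```

Note that the span "man saw" still receives a label even though it does not correspond to a parse-tree constituent, which is the property the abstract contrasts with SAMT-style labeling.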

Type of study: Research | Subject: Text processing papers
Received: 1395/6/10 | Accepted: 1395/12/15 | Published: 1397/3/23 | Published online: 1397/3/23 (Solar Hijri dates)


Rights and permissions
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
