Toward Building a Comprehensive Phrase-based English-Arabic Statistical Machine Translation System
The Egyptian Journal of Language Engineering
Article 2, Volume 4, Issue 2, September 2017, Pages 10-26
Document Type: Original Article
DOI: 10.21608/ejle.2017.59427
Authors
Sara Ebrahim 1; Samha R. El-Beltagy 2; Doaa Hegazy 3; Mostafa G. Mostafa 4
1 Scientific Computing Department, Faculty of Computer and Information Sciences (FCIS), Ain Shams University, Cairo, Egypt
2 Nile University (NU), Center for Informatics Science
3 Scientific Computing Department, Faculty of Computer and Information Sciences (FCIS), Ain Shams University, Cairo, Egypt
4 Computer Science at the Faculty of Computer and Information Sciences (FCIS), Ain Shams University
Abstract
This paper explores a phrase-based statistical machine translation (PBSMT) pipeline for the English-Arabic (En-Ar) language pair. The work surveys the most recent experiments conducted to enhance Arabic machine translation in the En-Ar direction. It focuses on free datasets and linguistically motivated ideas that enhance phrase-based En-Ar statistical machine translation (SMT), since it aims to use only those resources to build a large-scale En-Ar SMT system. In addition, the paper highlights Arabic linguistic challenges in machine translation (MT) in general. This paper can be considered a guide for building an En-Ar PBSMT system. Furthermore, the presented pipeline can be generalized to any language pair.
Keywords
Machine Translation; Arabic Natural Language Processing; Phrase-based; Statistical machine translation