Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/513276
Title: | A hybrid rules and statistical method for Arabic to English machine translation |
Authors: | Arwa Hatem Qassim (P65613) |
Supervisor: | Nazlia Omar, Assoc. Prof. Dr. |
Keywords: | Machine translation; Arabic language; English language; Machine translating |
Issue Date: | 2017 |
Description: | Machine translation (MT) is the translation of text by computer from one language to another. The main problem in this field is producing high-quality translations that meet human requirements. This problem is particularly challenging for translation between Arabic and English, because Arabic is a morphologically rich and complex language that differs significantly from other languages. The differences give rise to specific problems involving sentence structure rules, word order patterns, affixes, and ambiguity between the languages. Arabic is a highly inflectional language with relatively free word order and a variety of sentence structures, such as subject–object–verb, subject–verb–object, verb–subject–object, and verb–object–subject. The language has a large number of prefixes, suffixes, and infixes that can modify a stem to form words, which leads to a large vocabulary. Ambiguity may occur when a sentence or a phrase has more than one structure or meaning. The objective of this study is to propose a hybrid method that combines rule-based and statistical approaches for Arabic-to-English MT. The rule-based approach comprises 93 rules, developed from basic rules, that address word reordering and affixes in order to enhance the quality of translation from Arabic to English. This approach achieved 70% precision with the 1-gram model in the bilingual evaluation understudy (BLEU) system. The study also proposes a statistical approach that handles ambiguity by using the Expectation Maximization algorithm to estimate word translation probabilities, which are then used to select the translation word on the basis of collocation. The Expectation Maximization approach achieved 76% precision with the 1-gram model in the BLEU system. To further improve the results, the study designed a new approach that is a hybrid of the rule-based approach and the Expectation Maximization algorithm. 
The hybrid approach combines the strength of the rule-based approach, which uses a large number of rules to handle the word ordering problem, with the strength of the statistical approach, which selects the translation word on the basis of collocation to resolve ambiguity. The proposed approach significantly outperformed other available systems, improved translation quality, and addressed the Arabic word ordering and ambiguity problems. The evaluation results show that the approach achieved 89% precision with the 1-gram model in the BLEU system. The study also proposes a new statistical evaluation metric, called the Holder mean, that assesses MT quality by considering the size of word order differences, measured by the distance between words in a sentence. The approach was tested on the United Nations Arabic–English parallel corpus, where the hybrid approach achieved 91.9% on the Holder mean metric. Note: "Certification of Master's/Doctoral Thesis" is not available. |
Pages: | 198 |
Call Number: | P308.Q337 2017 3 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
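Two quantities named in the description, word translation probabilities estimated by Expectation Maximization and 1-gram precision in BLEU, can be illustrated with a small sketch. The record does not specify the thesis's actual model, rules, or data, so everything here is an assumption: the EM loop follows the textbook IBM Model 1 formulation (without a NULL word), the precision function is the standard clipped unigram precision from BLEU (without the brevity penalty), and the function names and transliterated toy sentences are hypothetical.

```python
from collections import Counter, defaultdict


def em_translation_probs(pairs, iterations=10):
    """Estimate t(e | f) by EM, IBM Model 1 style (assumed, not the thesis's model).

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict mapping (target_word, source_word) to a probability.
    """
    src_vocab = {f for src, _ in pairs for f in src}
    tgt_vocab = {e for _, tgt in pairs for e in tgt}
    # Start from a uniform distribution over target words.
    t = {(e, f): 1.0 / len(tgt_vocab) for e in tgt_vocab for f in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) alignments
        total = defaultdict(float)  # expected count of f being aligned at all
        # E-step: distribute each target word over the source words of its pair.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise the expected counts per source word.
        t = {(e, f): (count[(e, f)] / total[f] if total[f] else 0.0)
             for (e, f) in t}
    return t


def unigram_precision(candidate, reference):
    """Clipped 1-gram precision as used inside BLEU (no brevity penalty)."""
    if not candidate:
        return 0.0
    cand, ref = Counter(candidate), Counter(reference)
    # Each candidate word is credited at most as often as it occurs in the reference.
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / len(candidate)


if __name__ == "__main__":
    # Hypothetical toy corpus of transliterated Arabic and English tokens.
    toy_pairs = [
        (["kitab", "kabir"], ["big", "book"]),
        (["kitab", "jadid"], ["new", "book"]),
        (["bayt", "kabir"], ["big", "house"]),
    ]
    probs = em_translation_probs(toy_pairs)
    best = max(("big", "book", "new", "house"), key=lambda e: probs[(e, "kitab")])
    print("most probable translation of 'kitab':", best)
    print(unigram_precision(["the", "big", "book"], ["the", "large", "book"]))
```

On the toy corpus, co-occurrence alone is enough for EM to favour one target word per source word. A real system would also need a NULL alignment, smoothing, and a far larger corpus, and a full BLEU score combines several n-gram orders with a brevity penalty.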
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_96928+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 620.41 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.