Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/475657
Title: Named entity recognition in English translation of hadith documents using classifiers combination
Authors: Mohanad Jasim Jaber (P74140)
Supervisor: Saidah Saad, Dr.
Keywords: Named entity recognition
English translations
Hadith documents
Classifiers combination
Dissertations, Academic -- Malaysia
Issue Date: 15-Nov-2013
Description: With the increasing number of English translations of Islamic documents available on the internet, there is a need to retrieve and extract important information in order to fully understand these documents. The task of named entity recognition becomes more difficult in translated documents from a specific domain, such as the Islamic domain, since entities are more specific to that particular domain. Furthermore, translated named entities have their own characteristics, and available annotated English corpora do not cover transliterated Arabic names. Most current research focuses on extracting named entities from English text and pays little attention to extracting named entities from English translations of Islamic documents. This study therefore aims to design and develop a named entity recognition model for English translations of Hadith documents. The method applied in this study involves pre-processing techniques, feature extraction, machine learning techniques, and evaluation. A machine learning classification framework is proposed based on a combination (ensemble) of Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB) classifiers. The main idea behind this classifier ensemble is that it is often much more accurate than the individual classifiers alone. To enhance the named entity recognition obtained from English translations of Hadith documents, a list of suitable features has been identified. To evaluate the model, a manually annotated dataset, obtained from English translations of Muslim books of Hadith, is used. The results show that the voting combination approach outperformed single classifiers. The combined model achieves good performance, with an overall F-measure of 95.3% for identifying named entities.
These results show that the combined models, trained with suitable features, are better suited to recognizing named entities in translated Hadith texts than baseline models.
"Certification of Master's/Doctoral Thesis" is not available
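The voting combination mentioned in the description can be illustrated with a minimal sketch. It assumes each of the three classifiers has already produced one label per token. The function name `majority_vote`, the tag set (`PER`, `LOC`, `O`), the tie-break rule and the sample sentence are all hypothetical choices made for illustration; they are not taken from the thesis, which may combine the classifiers differently.

```python
from collections import Counter


def majority_vote(predictions, priority=None):
    """Combine per-token labels from several classifiers by plurality vote.

    predictions: dict mapping classifier name -> list of labels, one per token.
    priority: classifier names in tie-break order (an assumption of this
              sketch; defaults to the dict's insertion order).
    """
    names = list(predictions)
    order = list(priority) if priority else names
    if set(order) != set(names):
        raise ValueError("priority must list exactly the classifier names")
    length = len(predictions[names[0]])
    if any(len(predictions[n]) != length for n in names):
        raise ValueError("all classifiers must label the same number of tokens")

    combined = []
    for i in range(length):
        votes = Counter(predictions[n][i] for n in names)
        top = max(votes.values())
        winners = {label for label, count in votes.items() if count == top}
        # On a tie, take the label of the highest-priority classifier
        # whose prediction is among the tied labels.
        chosen = next(predictions[n][i] for n in order if predictions[n][i] in winners)
        combined.append(chosen)
    return combined


# Hypothetical token-level outputs for an invented sentence.
tokens = ["Abu", "Huraira", "reported", "that", "the", "Prophet", "said"]
predictions = {
    "svm": ["PER", "PER", "O", "O", "O", "PER", "O"],
    "me":  ["PER", "O",   "O", "O", "O", "PER", "O"],
    "nb":  ["O",   "PER", "O", "O", "O", "O",   "LOC"],
}
combined = majority_vote(predictions, priority=["svm", "me", "nb"])
for token, label in zip(tokens, combined):
    print(token, label)
```

In this sketch a label wins when more classifiers choose it than any other label, so an error made by a single classifier is outvoted whenever the other two agree. That is the intuition behind the claim that the ensemble can be more accurate than its individual members.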
Pages: 88
Publisher: UKM, Bangi
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat
