Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/476551
Title: | Named Entity Recognition for political domain in Arabic language |
Authors: | Halema H Mhamed Alshref (P50193) |
Supervisor: | Mohd Juzaiddin Aziz, Dr. |
Keywords: | Arabic language; Politics; Named Entity Recognition (NER); Universiti Kebangsaan Malaysia -- Dissertations |
Issue Date: | 3-Aug-2012 |
Description: | Named Entity Recognition (NER) is an information extraction subtask that seeks to identify and classify proper nouns in a text, such as persons, locations, organizations, numbers and temporal values. NER plays a significant role in many NLP applications, especially information extraction, information retrieval, machine translation, syntactic parsing/chunking and question answering. The task is considerably more challenging when it targets a morphologically complex language such as Arabic. The main aim of this research is to design and implement a rule-based NER system for the political domain. The rule-based approach consists of a lexicon, in the form of a verb contextual clue list and a noun contextual clue list, together with a set of grammar rules that recognize and classify named entities. The rule-based approach is suitable for NER because it is flexible and adaptable and can be easily configured to work with different domains. It is also suitable for languages, like Arabic, that lack a large multi-domain annotated corpus. We evaluated our approach using the ANER corpus. It achieved an overall accuracy of 94.86%, with 82.76% for Person NE, 98.3% for Location, 100% for Organization and 98.37% for MISC. Our approach achieved state-of-the-art results and outperformed related Arabic works. |
Pages: | 120 |
Call Number: | QA76.9.N38A587 2012 3tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
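The Description above outlines a rule-based design: contextual clue lists (verbs and nouns) plus grammar rules that classify the words around them. The following is a minimal Python sketch of that general idea, not the thesis's implementation. The clue words, the labels, the single "tag the token after a clue" rule and the function name `tag_entities` are all hypothetical, since this record does not give the actual lexicon or grammar.

```python
# Hypothetical clue lists; the thesis's real lexicon is not given in this record.
NOUN_CLUES = {
    "الرئيس": "PERSON",      # "the president"
    "مدينة": "LOCATION",     # "city"
    "حزب": "ORGANIZATION",   # "party"
}
VERB_CLUES = {
    "قال": "PERSON",         # "said"
    "صرح": "PERSON",         # "stated"
}


def tag_entities(tokens):
    """Apply one illustrative rule: the token right after a clue word
    receives the clue's label, unless that token is itself a clue."""
    clues = {**VERB_CLUES, **NOUN_CLUES}
    entities = []
    # The last token has no successor, so it cannot act as a clue.
    for i, token in enumerate(tokens[:-1]):
        label = clues.get(token)
        following = tokens[i + 1]
        if label is not None and following not in clues:
            entities.append((following, label))
    return entities


# "The president Mahmoud said in the city of Cairo" (whitespace tokenization)
sentence = "قال الرئيس محمود في مدينة القاهرة"
print(tag_entities(sentence.split()))
```

A real system for Arabic would likely also need morphological handling, such as attached prefixes and clitics, and multi-token entity spans, both of which this sketch leaves out.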
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ukmvital_119305+SOURCE1+SOURCE1.1.PDF (Restricted Access) | | 1.01 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.