Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/475660
Title: Enhanced word weighting of backpropagation neural network for Arabic stemming
Authors: Khaldoon Mezher Muhi (P74150)
Supervisor: Nazlia Omar, Associate Professor Dr.
Keywords: Arabic stemming
Word weight
Backpropagation
Learning technique
Dissertations, Academic -- Malaysia
Issue Date: 14-Feb-2016
Description: Stemming aims to identify the root or stem of each word by removing grammatical affixes or lexical information. Many stemming approaches have been proposed for various languages. Unlike in English, stemming in Arabic is not a simple matter of removing suffixes and prefixes: Arabic has a complex morphology, and stemming requires a comprehensive analysis of that morphology. Several approaches have been proposed to enhance Arabic stemming. However, state-of-the-art Arabic stemming algorithms such as the Khoja and Light stemmers were built using a rule-based approach. Hence, recent research has examined machine learning techniques for their ability to identify a word's stem based on training. Nonetheless, there is still room for improvement in introducing new features that can assign accurate weights to words. Assigning an accurate weight to a word plays an essential role in classifying its actual root. Therefore, this study aims to develop a set of features that enhance the word-weighting process. The proposed set of features consists of affixes, word length and tenses. The weight acquired from the proposed features is then fed as input to a backpropagation neural network (BPNN) classifier, which outputs the actual stem. The corpus used in this study is the Open Source Arabic Corpus (OSAC), which consists of 21,861 documents distributed among multiple topics including economy, sport, social, religion, stories, health, law, astronomy and food recipes. Two stemmers, Khoja and Light, were used as baselines for comparison with the proposed method. The experimental results show that the proposed method outperformed both the Khoja and Light stemmers, achieving 86% precision and 84% recall. In addition, a comparison was performed with a state-of-the-art BPNN for Arabic stemming, against which the proposed method also showed superior performance.
"Certification of Master's/Doctoral Thesis" is not available.
Pages: 92
Publisher: UKM, Bangi
Appears in Collections: Faculty of Science and Technology / Fakulti Sains dan Teknologi
