Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/476280
Title: Binary relevance (BR) method classifier for multi-label classification of Arabic text
Authors: Adil Yaseen Taha (P72237)
Supervisor: Sabrina Tiun, Dr.
Keywords: Binary-coded decimal system
Algorithms
Universiti Kebangsaan Malaysia -- Dissertations
Dissertations, Academic -- Malaysia
Issue Date: 16-Aug-2015
Description: Multi-label text classification, in which each document can be assigned several labels at once, has become increasingly important in recent years. It is a challenging task because the space of potential label sets grows exponentially with the number of candidate labels. A drawback of earlier multi-label classification methods is that they typically do not scale with the number of labels and the number of training examples. Classifying a large collection of high-dimensional text documents requires considerable computation time, especially for Arabic, a language with very complex and rich morphology. Furthermore, existing research has paid little attention to multi-label classification of Arabic text. Hence, this study aims to design and develop a new method for multi-label classification of Arabic text based on a chain of classifiers built on the binary relevance method. The chain is made up of a set of different machine learning classifiers. Four multi-label classification approaches were empirically evaluated: a chain of SVM classifiers, a chain of KNN classifiers, a chain of NB classifiers, and a chain of different types of classifiers. In addition, three feature selection methods (odds ratio, chi-square and mutual information) were studied, and their effect on the performance of Arabic multi-label text classification was investigated. The objective is to combine classification algorithms and feature selection efficiently to create a more accurate multi-label classification process. The model is evaluated on a manually annotated standard dataset. The results show that the collaborative chain of binary classifiers, which consists of a set of different machine learning classifiers, attains the best result. It achieves good performance, with an overall F-measure of 86.8% for multi-label classification of Arabic text. The results also show that the feature selection methods used have an important effect on classification. Overall, the chain of different algorithms proves to be an efficient and suitable method for Arabic multi-label text classification.
Master of Computer Science
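The chained approach named in the description can be illustrated with a small sketch. This record does not give the thesis's actual chain design, so the code below follows the generic classifier-chain idea only: one binary classifier per label, where each classifier sees the original features plus the labels predicted earlier in the chain. The class names (`BernoulliNB`, `OneNN`, `Chain`), the toy features and the toy labels are all hypothetical, and the two base learners stand in for the NB and KNN classifiers mentioned above.

```python
import math


class BernoulliNB:
    """Tiny Bernoulli naive Bayes over binary features (Laplace-smoothed)."""

    def fit(self, X, y):
        n_feat = len(X[0])
        self.log_prior, self.p = {}, {}
        for c in (0, 1):
            rows = [x for x, t in zip(X, y) if t == c]
            self.log_prior[c] = math.log((len(rows) + 1) / (len(X) + 2))
            # Smoothed probability that feature j is 1 given class c.
            self.p[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                         for j in range(n_feat)]
        return self

    def predict_one(self, x):
        def score(c):
            return self.log_prior[c] + sum(
                math.log(p if v else 1 - p) for v, p in zip(x, self.p[c]))
        return 1 if score(1) > score(0) else 0


class OneNN:
    """1-nearest-neighbour classifier using Hamming distance."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict_one(self, x):
        dists = [sum(a != b for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


class Chain:
    """Chain of binary classifiers, one per label (assumed design)."""

    def __init__(self, learners):
        self.learners = learners  # learners[j] predicts label j

    def fit(self, X, Y):
        for j, clf in enumerate(self.learners):
            # Augment the features with the true values of labels 0..j-1.
            Xj = [x + y[:j] for x, y in zip(X, Y)]
            clf.fit(Xj, [y[j] for y in Y])
        return self

    def predict_one(self, x):
        preds = []
        for clf in self.learners:
            # Each classifier sees the features plus earlier predictions.
            preds.append(clf.predict_one(x + preds))
        return preds


# Hypothetical toy data: three binary term features, two labels.
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]

# A chain of different classifier types: NB for label 0, 1-NN for label 1.
chain = Chain([BernoulliNB(), OneNN()]).fit(X, Y)
print(chain.predict_one([1, 1, 0]))
```

Passing a different list of learners to `Chain` gives a homogeneous chain instead, for example two `OneNN` instances, which mirrors the comparison between single-type chains and a mixed chain described above.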
Pages: 91
Call Number: QA75.T334 2015 3 tesis
Publisher: UKM, Bangi
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:
File: ukmvital_80526+SOURCE1+SOURCE1.0.PDF (Restricted Access)
Size: 549.91 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.