Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/476358
Title: | Named Entity Recognition in crime news documents using classifiers combination |
Authors: | Hafedh Ali Shabat (P72246) |
Supervisor: | Nazlia Omer, Assoc. Prof. Dr. |
Keywords: | Named Entity Recognition; Crime news; Classifiers; Dissertations, Academic - Malaysia |
Issue Date: | 10-Jun-2015 |
Description: | Most crimes committed today are reported on the internet through news articles, blogs and social networking sites. With the increasing volume of crime information readily available on the web, there is a need to retrieve and exploit relevant information in order to prevent crime more efficiently and effectively. Named Entity Recognition (NER), the task of automatically identifying named entities in text, can be used to assist this process. However, NER can be more challenging in a specific domain such as crime, since the entities are specific to that domain. Most current research focuses on extracting general types of entities and pays little attention to the extraction of crime-related entities. Thus, this study aims to design and develop a named entity recognition model for crime news documents based on a combination of classifiers. First, the crime type is identified, followed by the identification of named entities such as weapon, nationality and location in crime news documents. A machine learning classification framework is proposed based on Naive Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers. To improve named entity recognition in crime documents, a list of suitable features has been identified. To evaluate the model, a manually annotated data set obtained from BERNAMA is used. The combined model achieves an overall F-measure of 89.48% for identifying crime type and 93.36% for extracting crime-related entities. The combined models trained with suitable features outperform the baseline models. This shows that the proposed model is viable and can be used to identify named entities in crime news documents. The "Certification of Master's/Doctoral Thesis" is not available. |
Pages: | 99 |
Publisher: | UKM, Bangi |
URI: | https://ptsldigital.ukm.my/jspui/handle/123456789/476358 |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_82997+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 221.09 kB | Adobe PDF | View/Open |