Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/476188
Title: | Rule-based approach for extracting the nationality of suspect or victim in crime news documents |
Authors: | Mohamad Darwich (P69644) |
Supervisor: | Masnizah Binti Mohd, Professor Dr. |
Keywords: | Crime domain |
Issue Date: | Jul-2014 |
Description: | In the crime domain, analysts and investigators need efficient and effective access to criminal justice data and intelligence on crime cases in order to carry out investigations. Unstructured crime news documents contain valuable information, but searching them manually is impractical. Several information extraction systems have been implemented specifically for the crime domain to address this problem. This work reviews and analyzes those systems and finds that most rely on techniques that can be enhanced to produce more accurate results. This gap motivates an enhanced information extraction system that uses named entity recognition to extract the nationality entity from crime news documents and coreference resolution to associate that nationality with either the suspect or the victim. The system takes a hybrid approach that combines lexical lookup with rule-based methods. Its architecture has three components: preprocessing, nationality extraction, and victim-or-suspect reference identification. After extracting a nationality, the system locates all victim-related and suspect-related keywords in the text and measures the distance of each from the position of the nationality keyword. Based on the total distances, a rule-based algorithm decides whether the nationality more likely belongs to the victim or the suspect. Two experiments were conducted, one evaluating the nationality extractor component and the other evaluating the reference identification component. The first achieved precision, recall, and F-measure values of 90%, 94%, and 91% respectively. The second achieved 65%, 68%, and 66% respectively.
The evaluation concluded that the system is feasible and provides more accurate results than other systems. The system may serve as a foundation for future work on extracting the nationality from crime news documents and identifying whether it refers to the suspect or the victim. (Master/Sarjana) |
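The lookup-and-distance idea summarized in the description can be illustrated with a minimal Python sketch. This is not the thesis implementation: the keyword lists, the function names, the token-based distance measure, and the rule that the smaller total distance wins are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical lexicons; the keyword lists used in the thesis are not given in the abstract.
NATIONALITIES = {"malaysian", "indonesian", "nigerian", "bangladeshi"}
SUSPECT_WORDS = {"suspect", "robber", "attacker", "accused"}
VICTIM_WORDS = {"victim", "deceased", "injured"}


def tokenize(text):
    """Preprocessing: lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_nationalities(tokens):
    """Lexical lookup: return (position, token) for every nationality keyword."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in NATIONALITIES]


def total_distance(tokens, position, keywords):
    """Sum of token distances from `position` to each keyword; None if no keyword occurs."""
    distances = [abs(i - position) for i, tok in enumerate(tokens) if tok in keywords]
    return sum(distances) if distances else None


def assign_role(tokens, position):
    """Assumed rule: the role whose keywords have the smaller total distance wins."""
    suspect = total_distance(tokens, position, SUSPECT_WORDS)
    victim = total_distance(tokens, position, VICTIM_WORDS)
    if suspect is None and victim is None:
        return "unknown"
    if victim is None:
        return "suspect"
    if suspect is None:
        return "victim"
    if suspect == victim:
        return "unknown"
    return "suspect" if suspect < victim else "victim"


def extract(text):
    """Run the three steps and return a list of (nationality, role) pairs."""
    tokens = tokenize(text)
    return [(nat, assign_role(tokens, pos)) for pos, nat in extract_nationalities(tokens)]
```

For example, `extract("Police arrested a Nigerian suspect after the victim was found dead.")` pairs `"nigerian"` with `"suspect"`, because the suspect keyword sits one token away from the nationality while the victim keyword sits four tokens away. Summing distances favors the role with fewer keyword mentions, which is one reason a real system might normalize or weight the totals.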
Pages: | 72 |
Publisher: | UKM, Bangi |
URI: | https://ptsldigital.ukm.my/jspui/handle/123456789/476188 |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_75240+Source01+Source010.PDF | Restricted Access | 2.44 MB | Adobe PDF | |