Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/513214
Title: Enhancing text clustering based on nouns verbs extraction and water cycle clustering for crime trend patterns
Authors: Qusay Walid Bsoul (P64618)
Supervisor: Juhana Salim, Prof. Dr.
Keywords: Text clustering; Nouns verbs; Water Cycle Clustering; Crime; Dissertations, Academic -- Malaysia
Issue Date: 7-Sep-2016
Description: Crimes reported in the news are on the rise, and their prevention has become a priority concern for the police, the government and victims. One challenge facing intelligence and law enforcement agencies is the difficulty of analysing large volumes of crime reports. A crime pattern provides an explanation of the reasons behind crimes committed in specific areas, and a better understanding of these circumstances enables the introduction of mechanisms for changing the environment pertinent to such crimes in order to reduce the opportunities for particular crimes. Therefore, there is a need to use text clustering to analyse crime news and understand criminal activity more clearly by finding crime trend patterns. One weakness of crime news clustering concerns the extraction process: the most important crime words are verbs and nouns, yet verbs and nouns have not been used as the basis for extraction in crime news analysis. Another problem related to clustering is the random choice of the initial centroid for each cluster. The aim of this research is to analyse crime news using nouns verbs extraction and Water Cycle Clustering (WCC) to find crime trend patterns. Five phases were employed to achieve this aim. The first four phases aim to increase the quality of crime news clustering using nouns verbs extraction and WCC as the clustering algorithm. Six benchmark datasets and two real crime datasets were used to test the proposed crime news clustering. External, internal and statistical evaluations were used to verify the proposed clustering. The last phase analyses four groups of crime news patterns using named entity recognition. The datasets used consist of 3527 events in 2010 and 2011. The three main methods of analysis used to extract information are the analysis of crime patterns, crime offences and crime locations. The knowledge extracted is needed to choose suitable strategies.
The results showed that the nouns verbs extraction method outperformed other extraction methods. In addition, semantics were evaluated together with the proposed nouns verbs method; the performance was found to be worse than when semantics were not used. To fill the second gap related to crime clustering, a new meta-heuristic, WCC, was employed as the clustering algorithm and combined with the K-means algorithm in three different ways. In comparison with traditional clustering algorithms and other meta-heuristic clustering algorithms, Water Cycle Clustering combined with K-means (WCCK-means) outperformed the other clustering algorithms, including WCC on its own. Based on the proposed crime clustering, four crime patterns were generated to extract knowledge such as the percentage of crimes. The findings of the analysis showed that in Malaysia, the categories of crime with the highest occurrences are crimes related to drugs, gangs involved in drug trafficking, and murder.
"Certification of Master's/Doctoral Thesis" is not available
Pages: 232
Publisher: UKM, Bangi
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:
File: ukmvital_82186+SOURCE1+SOURCE1.0.PDF (Restricted Access)
Size: 664.31 kB
Format: Adobe PDF

