Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/513200
Title: | Modified biogeography-based optimisation algorithms for data clustering problems |
Authors: | Hammouri Abdelaziz Isam (P59629) |
Supervisor: | Salwani Abdullah, Prof. Dr. |
Keywords: | Biogeography Based Optimisation; Clustering; Cluster analysis - Data processing |
Issue Date: | 7-Apr-2015 |
Description: | Clustering is an NP-hard problem in which the search space grows exponentially with the size of the data. Clustering is used to find groups in unlabelled data, such that similar data are grouped together in one cluster. In the last few decades, a variety of metaheuristic methods have been developed as new clustering approaches; however, to date there is no single approach that is suitable for different types of data. Many of these approaches face technical problems: they depend heavily on the initialisation of their parameters, the initial population can affect the quality of the final solution, and they have weak exploration or exploitation capabilities. The aim of this research is to develop an alternative approach for finding high-quality clusters for different kinds of datasets with different data properties. To achieve this aim, three variants of a nature-inspired algorithm are proposed. Two types of datasets are used, i.e., six real benchmark datasets and fourteen synthetic datasets, which have different data properties. The research first investigates the parameter setting and selects the most suitable values for the Biogeography Based Optimisation (BBO) algorithm as a single-objective population-based clustering tool. Next, the performance of BBO is examined with three different constructive heuristic methods for generating the initial population. Experimental results show that BBO can obtain good clustering solutions with all of the constructive heuristic methods; it also outperforms six of the eight approaches that are available in the literature. This experiment shows that different constructive heuristics do not lead to different quality of the final clustering solution; this is due to the good information-sharing mechanism between solutions during the search process in BBO.
In order to improve the results obtained by BBO, it is hybridised with the Great Deluge (GD) algorithm (coded as BBO-GD), where GD is a local search algorithm used in the BBO-GD approach as an additional step to intensify the search around the solutions found by BBO. Experimental results show that BBO-GD outperforms all of the approaches that were used in the comparison with the first approach. Later, to handle data with different properties and shapes, BBO-GD is reformulated as a multi-objective clustering approach, since the single-objective clustering approach is not suitable for all types of data. In this stage, two conflicting clustering criteria (compactness and connectedness of clusters) are used to drive the search process. The multi-objective BBO-GD (coded as MO-BBO-GD) obtained high-quality clusters for various datasets with different data properties. Finally, a further improvement is carried out by embedding a variable population size mechanism into MO-BBO-GD (coded as MO-BBO-GD-VPS), where the population size parameter is auto-tuned based on the status of the search process. This is needed in order to reduce computational time. Experimental results show that MO-BBO-GD-VPS is able to produce better results in less computational time. In conclusion, MO-BBO-GD-VPS can be considered a good clustering tool that is capable of clustering data with different properties. Ph.D. |
Pages: | 277 |
Call Number: | QA278.H3385 2015 3 |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
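
Illustrative note: the description mentions Great Deluge (GD) as a local search step that intensifies the search around solutions found by BBO. The sketch below is only a generic, minimal illustration of that idea, not the thesis implementation. It assumes a centroid-based solution encoding, a compactness objective measured as the sum of squared distances to the nearest centroid, a linearly lowered water level, and a simple one-coordinate perturbation as the neighbourhood move. The names `sse` and `great_deluge` and all parameter values are hypothetical.

```python
import random


def sse(points, centroids):
    """Compactness (assumed objective): sum of squared distances
    from each point to its nearest centroid. Lower is better."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def great_deluge(points, centroids, iterations=500, step=0.1, seed=0):
    """Generic Great Deluge local search in minimisation form.

    A neighbour is accepted if it is no worse than the current solution
    or if its cost is at or below a steadily lowered water level.
    The linear decay rate used here is an assumption for illustration.
    """
    rng = random.Random(seed)
    current = [list(c) for c in centroids]
    current_cost = sse(points, current)
    best, best_cost = [list(c) for c in current], current_cost
    level = current_cost           # initial water level
    decay = level / iterations     # assumed linear lowering rate
    for _ in range(iterations):
        # Neighbour move (assumed): perturb one coordinate of one centroid.
        candidate = [list(c) for c in current]
        k = rng.randrange(len(candidate))
        d = rng.randrange(len(candidate[k]))
        candidate[k][d] += rng.uniform(-step, step)
        cost = sse(points, candidate)
        if cost <= current_cost or cost <= level:
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = [list(c) for c in candidate], cost
        level -= decay
    return best, best_cost


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
    start = [(1.0, 1.0), (4.0, 4.0)]  # e.g. a solution handed over by a global search
    refined, cost = great_deluge(data, start)
    print("initial cost:", sse(data, start))
    print("refined cost:", cost)
```

In a hybrid of the kind the description outlines, a routine like this would presumably be called on solutions produced by the population-based search; how and when the thesis applies it is not stated in this record.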