Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/513389
Title: | Enhanced memetic differential evolution optimisation algorithms for data clustering problems |
Authors: | Hossam Moh'd Jabr Mustafa (P81300) |
Supervisor: | Masri Ayob, Prof. Dr. |
Keywords: | Universiti Kebangsaan Malaysia -- Dissertations; Dissertations, Academic -- Malaysia; Algorithms; Data sets; Memetics |
Issue Date: | 15-Aug-2019 |
Description: | The performance of data clustering algorithms depends mainly on their ability to balance exploration and exploitation of the search, and on the effectiveness of their outlier detection techniques. Although recent single-criterion data clustering algorithms have achieved reasonable quality solutions for some datasets, their performance across real-life datasets could be improved. Moreover, most of these data clustering algorithms adopt a single-criterion optimisation approach, which often fails to find good data clustering solutions for a wide diversity of datasets with different cluster characteristics. A multi-objective meta-heuristic approach has sometimes been utilised to address this issue; it seeks an optimal clustering solution by maximising or minimising more than one objective function. Some of these data clustering algorithms (e.g. differential evolution (DE), particle swarm optimisation (PSO) or non-dominated sorting genetic algorithm (NSGA-II)) find good quality solutions for some datasets, but fail to attain good results across all datasets. These shortcomings could be caused by the challenges of balancing exploration and exploitation, which may lead to premature convergence, stagnation or weak diversity in the Pareto-front solutions. Moreover, these clustering algorithms are usually designed using distance measures. They may experience challenges in identifying data points that are either noise or outliers. Three memetic differential evolution algorithms are proposed to overcome the shortcomings mentioned above. The research first proposes a single-criterion memetic differential evolution optimisation algorithm (MADE). The memetic algorithm (MA) employs an adaptive DE mutation operator. Such a combination is expected to improve the convergence and gain a better balance between exploration and exploitation. 
The experimental results, based on several real-life benchmark datasets taken from the UCI repository, show that MADE outperformed other competing algorithms. Next, the research introduces a multi-objective memetic differential evolution algorithm (MOMDE) for data clustering. The MOMDE combines the memetic differential evolution algorithm with the dominance-based multi-objective approach, in order to improve the search for optimal clustering by maximising and/or minimising two cluster quality measures for many datasets. Finally, the research proposes an enhanced MOMDE algorithm (eMOMDE) based on the local outlier factor (Conn_LOF), which aims to improve the performance of the connectivity measure of the objective function by eliminating the outliers. The experiments, based on real-life datasets from the UCI machine learning repository and synthetic two-dimensional datasets, showed that the MOMDE and eMOMDE algorithms outperformed the other data clustering algorithms compared. External validity is assessed using the F-measure, which evaluates the accuracy of the obtained clustering, whilst multi-objective performance assessment metrics are used to evaluate the quality of the Pareto-optimal sets in terms of convergence, diversity, coverage, and overall non-dominated vector generation. In most cases, the proposed algorithms significantly outperformed those reported in recent research when tested on standard benchmark datasets. This indicates that the combination of the adaptive DE mutation strategy, local search, multi-objective optimisation, and handling of outliers within the clustering criterion can enhance the performance of the MA in solving data clustering problems for different kinds of datasets. Ph.D. |
Pages: | 225 |
Call Number: | Z692.D37M837 2019 3 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_122135+SOURCE1+SOURCE1.0.PDF | Restricted Access | 35.16 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.