Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/519672
Title: | Hybrid discernibility of rough set based algorithms for overlap clustering |
Authors: | Djoko Budiyanto Setyohadi (P51680) |
Supervisor: | Azuraliza Abu Bakar, Prof. Dr. |
Keywords: | Data mining; Fuzzy C-Means (FCM); Universiti Kebangsaan Malaysia -- Dissertations |
Issue Date: | 2-May-2014 |
Description: | Overlap clustering is important in data mining because real-world data are vague and uncertain. These conditions blur the distinction among classes, so a conventional clustering algorithm often fails to find the appropriate clusters. Fuzzy C-Means (FCM) is a well-known overlap clustering algorithm owing to its simplicity and high performance. The main advantage of FCM is that it is more natural: objects are not forced to belong fully to one class, and overlap clustering is performed using degrees of class membership. FCM relies on distance measurement for membership computation, which leads to two problems. First, it is sensitive to outliers, and its performance decreases as the data dimension increases. Second, its use of random initial seeds causes the local optima problem. Recently, the Rough K-Means (RKM) algorithm was developed to handle the vagueness of data in clustering, and it outperforms FCM on overlapping objects by using the boundary area concept. However, RKM is less descriptive than FCM, since it only separates the vague objects from the crisp objects. Therefore, this study proposes a hybrid rough set based clustering method to address the overlap clustering problem. Two algorithms are proposed: the Discernibility of Rough Set (DR) based algorithm for initial seed computation, and the Rough K-Means Discernibility (RKMD) algorithm for overlap clustering. The DR algorithm is proposed to optimize the original RKM clustering, while RKMD is proposed to compute membership values that handle the overlapping objects within the boundary area. DR and RKMD are then hybridized into the DR-RKMD algorithm for overlap clustering. DR-RKMD is then extended to the outlier detection task to validate its performance. A new outlier detection factor is proposed in this work.
The aim is to show that DR-RKMD can detect outliers effectively, since the performance of a clustering method affects the performance of a clustering-based outlier detection method. The experiments on the proposed algorithms are conducted in three phases. First, DR is validated using the Davies-Bouldin index, by varying the threshold value that controls the vague objects. DR performs better than previous methods since it allows the vague objects to be adjusted efficiently. Second, DR-RKMD is validated in terms of how precisely objects are assigned to the appropriate cluster, the compactness of the clusters, and the separation between clusters. The measures used are the Sum of Squared Error, the Dunn index, and the Silhouette index. The results show that the proposed algorithm outperformed previous algorithms on several overlap datasets. The complexity, which represents computation cost, is also reduced. Third, the improved DR-RKMD for outlier detection with the new detection factor produces a better detection rate than several previous methods when tested on several benchmark datasets. "Certification of Master's/Doctoral Thesis" is not available. |
Pages: | 167 |
Call Number: | QA278.D566 2014 3 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
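
Illustrative note: the Description contrasts FCM, which assigns degrees of class membership, with RKM, which only separates vague (boundary) objects from crisp ones. The sketch below is not taken from the thesis and does not implement DR, RKMD, or DR-RKMD. It shows only the standard FCM membership formula and a ratio-based rough assignment in the style commonly used for rough k-means. The function names, the fuzzifier `m = 2.0`, and the ratio `threshold = 1.3` are assumptions chosen for illustration.

```python
import math


def fcm_memberships(point, centroids, m=2.0):
    """Standard FCM membership degrees of one point to each centroid.

    m is the fuzzifier (assumed 2.0 here); the degrees sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


def rough_assignment(point, centroids, threshold=1.3):
    """Ratio-based rough assignment (assumed variant, not the thesis method).

    Returns (lower, upper):
      lower -- index of the cluster the point crisply belongs to, or None
               if the point falls in a boundary area;
      upper -- indices of all clusters whose upper approximation holds it.
    """
    dists = [math.dist(point, c) for c in centroids]
    nearest = min(range(len(dists)), key=dists.__getitem__)
    d_min = dists[nearest]
    if d_min == 0.0:
        return nearest, [nearest]
    # Clusters almost as close as the nearest one share the point.
    upper = [k for k, d in enumerate(dists) if d / d_min <= threshold]
    lower = nearest if len(upper) == 1 else None
    return lower, upper


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (4.0, 0.0)]
    for p in [(1.0, 0.0), (2.2, 0.0)]:
        print(p, fcm_memberships(p, centroids), rough_assignment(p, centroids))
```

With these assumed settings, a point close to one centroid receives a high FCM degree for that cluster and a crisp rough assignment, while a point near the midpoint receives comparable FCM degrees and falls into the boundary area of both clusters. The rough result flags the object as vague but, unlike the FCM degrees, does not say how strongly it leans toward either cluster, which is the descriptiveness gap the Description mentions.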