Integrating correlation clustering and agglomerative hierarchical clustering for holistic schema matching

Basel Mahmoud Alshaikhdeeb

Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/476625

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Kamsuriah Ahmad, Dr.
dc.contributor.author	Basel Mahmoud Alshaikhdeeb
dc.contributor.other	P59517	-
dc.date.accessioned	2023-10-06T09:22:39Z	-
dc.date.available	2023-10-06T09:22:39Z	-
dc.date.issued	2014-04-15
dc.identifier.other	ukmvital:122171
dc.identifier.other	P59517	-
dc.identifier.uri	https://ptsldigital.ukm.my/jspui/handle/123456789/476625	-
dc.description	With the dramatic growth of heterogeneous data sources accessible over the Web, data integration has become more significant in the fields of data warehousing, the semantic web and e-commerce. The purpose of data integration is to provide a unified view over various heterogeneous sources. Schema matching plays an essential role in data integration by finding semantic correspondences between elements of two schemas. Recently, large-scale schema matching, which matches many schemas concurrently rather than pair-wise, has attracted considerable attention. Holistic schema matching is a challenging form of large-scale schema matching that takes a varying number of schemas as input and finds the similarities among them. However, matching many input schemas may take longer and may produce poor-quality matches. Therefore, reducing a large search space in order to achieve more accurate matching has become a challenging issue. Many approaches have been proposed to reduce the search space using clustering techniques, whether partitioning clustering, such as k-means and k-medoids, or hierarchical clustering, such as agglomerative and divisive. However, the current approaches still have drawbacks and need improvement in terms of performance. Thus, this research proposes an improved integrated clustering method that reduces the search space while avoiding random initial solutions, which leads to effective holistic schema matching. The proposed clustering method is an integration of correlation clustering and hierarchical agglomerative clustering, which works on maximizing the dissimilarity and minimizing the similarity between clusters in order to produce the initial solutions and then matches the corresponding attributes to their relevant clusters.
Furthermore, a pre-processing phase has been implemented, including a domain dictionary and auxiliary information (such as synonyms and abbreviations). The experiments are carried out on the Airfare, Auto and Book data sets from the UIUC Web Integration Repository. Each data set consists of 20 web interfaces. The results of the experiments are compared with other matching approaches. They show that Airfare, Auto and Book achieve accuracies of 0.9, 0.93 and 0.9, respectively. The proposed method can contribute towards more effective and promising results in solving holistic schema matching.
dc.description	Master of Information Technology
dc.description	"Certification of Master's / Doctoral Thesis" is not available
dc.language.iso	eng
dc.publisher	UKM, Bangi
dc.relation	Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat
dc.subject	Data integration (Computer science)
dc.subject	Statistical matching
dc.subject	Cluster analysis
dc.subject	Universiti Kebangsaan Malaysia -- Dissertations
dc.subject	Dissertations, Academic -- Malaysia
dc.title	Integrating correlation clustering and agglomerative hierarchical clustering for holistic schema matching
dc.type	theses
dc.rights.holder	UKM	-
dc.format.pages	78
dc.identifier.callno	QA76.9.D338A457 2014 3 tesis
dc.identifier.barcode	005561(2021)(PL2)
Appears in Collections:	Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:

File	Description	Size	Format
ukmvital_122171+SOURCE1+SOURCE1.0.PDF	Restricted Access	16.43 MB	Adobe PDF