Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/513450
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Aziz Deraman, Prof. Dato' Dr. | - |
dc.contributor.author | Jaradat Ashraf G.H. (P36170) | - |
dc.date.accessioned | 2023-10-16T04:36:49Z | - |
dc.date.available | 2023-10-16T04:36:49Z | - |
dc.date.issued | 2015-04-15 | - |
dc.identifier.other | ukmvital:82082 | - |
dc.identifier.uri | https://ptsldigital.ukm.my/jspui/handle/123456789/513450 | - |
dc.description | Information integration (II) is the general process of producing a unified repository from a set of multiple heterogeneous and autonomous sources that may contain (semi-)structured or unstructured data. Entity resolution (ER) with imperfection management is widely accepted as a major aspect of integrating multiple information spaces whose entities appear under varied identifiers, abbreviated names, and multi-valued attributes. This raises the issue of moving from imperfect data to the production of a probabilistic database. However, many existing ER approaches are not designed to accommodate this. This work, therefore, studies ER as a non-trivial problem within the two tasks of entity linkage and data fusion, over multiple possible imperfect entity correspondences and multi-valued attributes. Specifically, it has three main objectives: proposing a best-effort integration framework that copes with the probabilistic nature of the ER process and its outputs, constructing a probabilistic resolution model, and proving the feasibility of the proposed model. The work follows a mixed-methods research approach with four main methodology phases: literature review, problem specification and framework proposal, model construction, and prototype implementation and evaluation. As a result, a modelling solution methodology that copes well with the addressed problems within the information integration frame is introduced. It comprises three activities: i) A new best-effort integration framework is proposed and three functionality resolution challenges are outlined. In this proposal, the probabilistic instance integration element is added to the framework structure, the informative digital object (iDO) modelling concept is used, and the pair-wise source-to-target process is performed as a chain of separate stages. ii) A probabilistic ER model, called PrinDO, is constructed. It is based on two elements. The first is the set of probabilistic formulations and computations that represent and quantify the linkage, merging, and fusion outputs to create a new probabilistic global entity containing a set of alternative ordinary merged entities. The second is the iDO concept, used to obtain domain-independent resolution rules, reduce the uncertainty of the generated possible worlds, and establish the network digital object (nDO) concept that depicts the global entities. iii) A prototype named imperfect instance management and integration (Impiana-I) is developed, and experiments on two real-world datasets are conducted to assess the feasibility of the proposed model. From the findings, it can be concluded that the model is technically and practically capable of managing imperfect data and obtaining true resolution answers at low cost, without manual intervention or loss of valuable information. It obtained satisfactory results, comparable in quality to those of precise approaches. The results improved when more than one alternative was considered for the final answer: the F-measure for the highest-ranked alternative was 0.957 and 0.946 in one of the Restaurant and Cora experiments, respectively, and rose to 0.987 and 0.958, respectively, when the second alternatives for possible resolution answers were also taken. The model can be extended to address imperfection challenges at the schema, query, and global indexing levels, and for different information integration applications. It can also complement the traditional precise resolution approach when human intervention is needed or conflicting results are observed. Ph.D. | - |
dc.language.iso | eng | - |
dc.publisher | UKM, Bangi | - |
dc.relation | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat | - |
dc.rights | UKM | - |
dc.subject | Decision making -- Mathematical models | - |
dc.subject | Universiti Kebangsaan Malaysia -- Dissertations | - |
dc.subject | Dissertations, Academic -- Malaysia | - |
dc.title | Best effort resolution model for imperfect instance management using probabilistic network digital object approach | - |
dc.type | Theses | - |
dc.format.pages | 297 | - |
dc.identifier.callno | QA273.J347 2015 3 tesis | - |
dc.identifier.barcode | 001735 | - |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_82082+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 11.51 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.