Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/476474
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Nazlia Omar, Prof. Madya Dr. | |
dc.contributor.author | Ridha Hussein Chassab (P80595) | |
dc.date.accessioned | 2023-10-06T09:19:08Z | - |
dc.date.available | 2023-10-06T09:19:08Z | - |
dc.date.issued | 2017-04-18 | |
dc.identifier.other | ukmvital:99003 | |
dc.identifier.uri | https://ptsldigital.ukm.my/jspui/handle/123456789/476474 | - |
dc.description | Resolving anaphoric reference is considered one of the most challenging tasks in natural language processing. Numerous related studies have focused mainly on creating automatic mechanisms for resolving pronouns. It is difficult to give a broad, reasonable and calculable account of how the resolution process works, since humans themselves handle it subconsciously and are mostly unaware of its particularities. The pronoun "it" can be misleading in context, because it can either refer to a previous entity, which is the normal case, or relate to no entity at all. Pronouns of the latter type, which have no antecedent, are often called non-anaphoric or non-referential. They pose a serious problem for researchers implementing anaphora resolution algorithms, because they trigger unnecessary attempts to find referents, which produce inaccurate results. Numerous researchers have addressed this issue yet underrated the problem of non-referential pronouns by introducing manual solutions that handle these pronouns separately. Existing machine learning solutions, however, rely on a limited set of features, which makes it necessary to use more categories of features to achieve satisfactory results. There is therefore a genuine need to address these issues, which would increase the performance and accuracy of all related anaphora resolution mechanisms. The main aim of this study is to design and develop a model capable of classifying the non-referential pronoun "it" in a given text. To detect both referential and non-referential uses of the pronoun in context, the research proposes a four-phase methodology comprising dataset acquisition, preprocessing, classification and evaluation. Three general feature categories are proposed: word lemmas, POS tags and named entities. The Q-A corpus, which provides 900 fully labeled question and answer pairs sampled from the Yahoo Q&A corpus, is used as the dataset. A Support Vector Machine (SVM) is used as the machine learning classifier for this model. The resulting accuracy of 86% shows that the model is capable of identifying both referential and non-referential uses of the pronoun "it" in a given text. "Certification of Master's/Doctoral Thesis" is not available. | |
dc.language.iso | eng | |
dc.publisher | UKM, Bangi | |
dc.relation | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat | |
dc.rights | UKM | |
dc.subject | Natural language processing | |
dc.subject | Computational linguistics | |
dc.title | A machine-learning approach to anaphora resolution of non-referential pronoun "it" for question answers pairs | |
dc.type | theses | |
dc.format.pages | 84 | |
dc.identifier.callno | P98.C483 2017 3 tesis | |
dc.identifier.barcode | 003243(2018) | |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
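The abstract names three feature categories (word lemmas, POS tags and named entities) but the record does not say how they are encoded. The following is a minimal sketch of one plausible encoding, not the thesis's implementation. The `Token` type, the `it_features` and `extract_instances` helpers, the two-token context window and the `<pad>` marker are all assumptions made for illustration, and the tags are assumed to come from an upstream tagger.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    """One pre-tagged token (hypothetical type; tags come from an upstream tagger)."""
    word: str
    lemma: str
    pos: str  # part-of-speech tag, e.g. Penn Treebank "PRP"
    ner: str  # named-entity label, "O" when the token is outside any entity


def it_features(tokens: List[Token], index: int, window: int = 2) -> Dict[str, str]:
    """Collect lemma, POS and named-entity features around tokens[index].

    The window size is an assumption; positions outside the sentence
    are filled with a "<pad>" marker.
    """
    features: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the pronoun itself, it is always "it"
        position = index + offset
        inside = 0 <= position < len(tokens)
        features[f"lemma[{offset:+d}]"] = tokens[position].lemma.lower() if inside else "<pad>"
        features[f"pos[{offset:+d}]"] = tokens[position].pos if inside else "<pad>"
        features[f"ner[{offset:+d}]"] = tokens[position].ner if inside else "<pad>"
    return features


def extract_instances(tokens: List[Token]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (token index, feature dict) for every occurrence of "it"."""
    return [
        (i, it_features(tokens, i))
        for i, token in enumerate(tokens)
        if token.word.lower() == "it"
    ]


# Illustrative sentence with hand-assigned tags: a non-referential "it".
sentence = [
    Token("It", "it", "PRP", "O"),
    Token("is", "be", "VBZ", "O"),
    Token("raining", "rain", "VBG", "O"),
    Token("in", "in", "IN", "O"),
    Token("London", "London", "NNP", "LOCATION"),
    Token(".", ".", ".", "O"),
]
instances = extract_instances(sentence)
```

Feature dictionaries of this shape could then be one-hot encoded and passed to an SVM, for example with scikit-learn's `DictVectorizer` and `LinearSVC`. That pairing is a suggestion here rather than something the record confirms.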
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_99003+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 200.34 kB | Adobe PDF | View/Open |