Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/476474
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Nazlia Omar, Prof. Madya Dr. |
dc.contributor.author | Ridha Hussein Chassab (P80595) |
dc.date.accessioned | 2023-10-06T09:19:08Z | -
dc.date.available | 2023-10-06T09:19:08Z | -
dc.date.issued | 2017-04-18 |
dc.identifier.other | ukmvital:99003 |
dc.identifier.uri | https://ptsldigital.ukm.my/jspui/handle/123456789/476474 | -
dc.description | Resolution of anaphoric reference is considered one of the most challenging tasks in natural language processing. Numerous related studies have focused mainly on creating automatic mechanisms for resolving pronouns. It is difficult to give a broad, reasonable and calculable account of how the resolution process works, since humans themselves handle it subconsciously and are mostly unaware of its particularities. The pronoun it can be misleading in context, because it may either refer to a previous entity in the text, which is the normal case, or relate to no entity at all. Pronouns of the latter kind, which have no antecedent, are often called non-anaphoric or non-referential. They pose a serious problem for researchers implementing anaphora resolution algorithms, because they trigger unnecessary attempts to find referents and so produce inaccurate results. Numerous researchers have addressed this issue, yet they underrated the problem of non-referential pronouns by introducing manual solutions that handle these pronouns separately. Existing machine learning solutions, moreover, rely on a limited set of features, so more feature categories are needed to achieve satisfactory results. There is therefore a genuine need to address these issues, which would increase the performance and accuracy of all related anaphora resolution mechanisms. The main aim of this study is to design and develop a model capable of classifying the non-referential pronoun it in a given text. To detect both referential and non-referential uses of the pronoun in context, the research proposes a four-phase methodology: dataset acquisition, preprocessing, classification and evaluation. Three general feature categories are proposed: word lemmas, POS tags and named entities. The Q-A corpus, which contains 900 fully labeled question-and-answer pairs sampled from the Yahoo Q&A corpus, is used as the dataset. A Support Vector Machine (SVM) is used as the machine learning classifier for this model. The resulting accuracy of 86% shows that the model can identify both referential and non-referential uses of the pronoun it in a given text. |
dc.description | "Certification of Master's/Doctoral Thesis" is not available |
dc.language.iso | eng |
dc.publisher | UKM, Bangi |
dc.relation | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
dc.rights | UKM |
dc.subject | Natural language processing |
dc.subject | Computational linguistics |
dc.title | A machine-learning approach to anaphora resolution of non-referential pronoun "it" for question answers pairs |
dc.type | theses |
dc.format.pages | 84 |
dc.identifier.callno | P98.C483 2017 3 tesis |
dc.identifier.barcode | 003243(2018) |
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:
File | Description | Size | Format
ukmvital_99003+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 200.34 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.