Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/513332
Title: | Bidirectional transfer and prototypical representation with multi-source learning for cross-domain sentiment analysis |
Authors: | Tareq Abdo Abdullah Al-Moslmi (P75013) |
Supervisor: | Nazlia Omar, Assoc. Prof. Dr. |
Keywords: | Natural language processing (Computer science); Computational linguistics; Data mining; Universiti Kebangsaan Malaysia -- Dissertations |
Issue Date: | 30-Jun-2018 |
Description: | The rapid growth of web opinion data has created a need for automatic tools to analyse and understand people’s sentiments toward different topics. In real-world environments, sentiment is expressed differently in different domains, and the meaning of sentiment-bearing words also varies between domains. One of the main problems is the lack of labelled data in the target domain, which is necessary for precise in-domain sentiment classification. Cross-domain sentiment analysis is the task of adapting a learner trained on data from domains with sufficient labelled data to predict the polarity of opinions in a domain where labelled data is scarce, without requiring any labelled data for the target domain. Most state-of-the-art cross-domain sentiment analysis methods suffer from low performance due to the sparsity problem and the large difference between the source and target domains. Although they rely on transfer, these methods transfer both noisy and useful sentiment features from the source domains, and do not exploit the useful sentiment knowledge in the unlabelled target domain data. To handle these issues, this research proposes new sentiment transfer models for cross-domain sentiment analysis. To reduce the difference in feature distributions and to ensure distribution consistency between the source domain and the target domain, this research designed a new sentiment transfer-based model based on a novel probability co-occurrence measure and a bidirectional feature transfer and expansion algorithm. However, a mechanism is still needed to avoid transferring noisy features (negative transfer) along with useful sentiment features, and to ensure the suitability of source instances. Consequently, a new Prototypical Opinion Lexical Element (POLE) model for cross-domain sentiment classification was proposed. 
In many practical cases, the source and target distributions can differ substantially, and in some cases crucial sentiment features in the target domain may have no support in the source domain. Thus, a new semi-supervised ensemble Multi-source Domain Adaptation (MDA) model is established to bridge the gap between source and target domains by adding both useful sentiment target features and instances. These models are evaluated on the most widely used standard cross-domain sentiment classification datasets. The results indicate that the new sentiment transfer-based model outperforms some of the other state-of-the-art algorithms and achieves results comparable to the other algorithms in the comparison. In addition, the experimental results show that both the POLE model and the MDA model significantly improve the overall performance of cross-domain sentiment analysis and outperform state-of-the-art algorithms. Ph.D. |
Pages: | 187 |
Call Number: | QA76.9.N38M647 2018 3 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_118687+SOURCE1+SOURCE1.0.PDF Restricted Access | | 3.93 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.