Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/463484
Title: A combined method of utilized features and K-means clustering approach for biomedical named entities recognition
Authors: Eiman Nowaem Aloffi (P65618)
Supervisor: Sabrina Tiun, Dr.
Keywords: K-means clustering
Classify drugs
Biomedical entities
Dissertations, Academic -- Malaysia
Issue Date: 14-Jun-2015
Description: Biomedical Named Entity Recognition (BNER) is the task of extracting biomedical names such as diseases, chemical compounds, genes and proteins. Recently, drug name recognition has attracted the attention of several researchers because of its importance. However, most of these efforts have proposed supervised learning (SL) techniques. SL techniques require predefined class labels, which are difficult to specify for drug names because they cover a very large number of entity types. This study examines the capability of unsupervised learning to extract drug names by proposing a method that combines selected features with a k-means clustering approach. The corpus used in this study is SCAI. Multiple preprocessing tasks are performed to eliminate irrelevant data. The proposed features consist of affixes and POS tagging. The affix features exploit the fact that drug names often share prefixes (e.g. Abraxane, Abreva, Ablavar) or suffixes (e.g. Tramadol, Panadol, Themadol). The POS tagger assigns each word a tag such as noun, verb or adjective, which helps identify the names of drugs and chemical compounds. K-means clustering is used to group drugs by semantic similarity under three similarity measures: Cosine, Dice and Jaccard. The proposed method is evaluated with the standard information retrieval metrics: Precision, Recall and F-measure. The experimental results show an improvement, with the proposed method achieving an F-measure of 90%. Future work could use further features to extend the extraction process to more biomedical entities.
"Certification of Master's/Doctoral Thesis" is not available.
Pages: 65
Publisher: UKM, Bangi
Appears in Collections: Faculty of Science and Technology / Fakulti Sains dan Teknologi
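
The description above names affix features, three similarity measures (Cosine, Dice and Jaccard) and the Precision, Recall and F-measure metrics. The following is a minimal Python sketch of those pieces only, not the thesis implementation: the feature scheme (character prefixes and suffixes up to three letters), the set-based forms of the measures, and all function names are assumptions made for illustration. The k-means step and the POS tagging are not shown.

```python
import math


def affix_features(token, max_len=3):
    """Return prefix/suffix features of a token as a set.

    Assumption: affixes are plain character n-grams of length 1..max_len;
    the record does not say how the thesis defines them.
    """
    t = token.lower()
    feats = set()
    for k in range(1, max_len + 1):
        if len(t) >= k:
            feats.add("pre=" + t[:k])
            feats.add("suf=" + t[-k:])
    return feats


def jaccard(a, b):
    """Jaccard similarity of two feature sets: |A n B| / |A u B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice similarity of two feature sets: 2|A n B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """Cosine similarity for binary feature sets: |A n B| / sqrt(|A||B|)."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def precision_recall_f(predicted, gold):
    """Precision, recall and balanced F-measure over sets of extracted names."""
    tp = len(predicted & gold)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


if __name__ == "__main__":
    x = affix_features("Tramadol")
    y = affix_features("Panadol")
    # The two names share only their suffix features (l, ol, dol).
    print(jaccard(x, y), dice(x, y), cosine(x, y))
```

In this sketch, names that share a suffix such as "-dol" score higher under all three measures than names with no common affix, which is the property a clustering step could exploit.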

Files in This Item:
File: ukmvital_82212+SOURCE1+SOURCE1.0.PDF
Access: Restricted Access
Size: 424.36 kB
Format: Adobe PDF

