Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/476533
Title: | Stemming algorithm for different tenses to improve Persian dictionary |
Authors: | Arash Ghazvini (P56178) |
Supervisor: | Tengku Mohd Tengku Sembok, Prof. Dr. |
Keywords: | Computational linguistics; Computer algorithms; Natural language processing (Computer science) |
Issue Date: | 6-Aug-2012 |
Description: | Persian is an Indo-European language known for its complex morphology. Persian has a variety of tenses; this research project focuses on the past subjunctive, past perfect, continuous past, present perfect and simple past. A verb may appear in different forms in a sentence depending on person, number, gender, tense, mood and the occurrence of certain roots. This research project presents the development of a Persian stemming algorithm for the above-mentioned tenses and its impact on improving a Persian dictionary. The main problem is to design a Persian stemming algorithm that removes affixes from a verb in order to extract its root and convert it to the infinitive of the verb. The objective is to design a Persian stemming algorithm for verbs in the mentioned tenses and to implement the proposed algorithm so that the Persian dictionary returns the meaning of the infinitive. The dictionary is then tested to show that it achieves higher accuracy than conventional Persian dictionaries, which return no results for affixed verbs because all verbs are stored in the database as infinitives. Upon a search, the system first gets a null result if the word is an affixed verb; it then processes the word with a stemming procedure based on finite state automata to remove the affixes and find the root of the verb. The root is then converted into an infinitive and the database is searched again. According to the findings, the dictionary based on the implemented Persian stemming algorithm is fully accurate for regular verbs in the mentioned tenses, that is, verbs formed from their infinitive according to general grammatical rules. This thesis has no student declaration. |
Pages: | 107 |
Call Number: | QA76.9.A43G483 2012 3 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat |
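The lookup flow outlined in the Description (search, null result, stemming, conversion to infinitive, second search) can be illustrated with a minimal sketch. This is not the thesis implementation: it uses plain affix stripping with dictionary validation instead of finite state automata, covers only the simple past and continuous past, and works on a Latin transliteration. The names `DICTIONARY`, `PERSONAL_ENDINGS`, `candidate_infinitives` and `lookup`, as well as the two dictionary entries, are hypothetical and chosen for illustration only.

```python
# Toy dictionary keyed by infinitive (transliterated; entries are illustrative).
DICTIONARY = {
    "raftan": "to go",
    "kharidan": "to buy",
}

# Simple-past personal endings in transliteration, plus the empty ending
# for the third person singular. Assumption: regular verbs only.
PERSONAL_ENDINGS = ("and", "im", "id", "am", "i", "")

# Prefix that marks the continuous past in this simplified model.
CONTINUOUS_PREFIX = "mi"


def candidate_infinitives(word):
    """Yield possible infinitives for an affixed simple/continuous past verb."""
    forms = [word]
    if word.startswith(CONTINUOUS_PREFIX):
        forms.append(word[len(CONTINUOUS_PREFIX):])
    for form in forms:
        for ending in PERSONAL_ENDINGS:
            if form.endswith(ending):
                stem = form[: len(form) - len(ending)]
                if stem:
                    # Regular infinitive = past stem + "an".
                    yield stem + "an"


def lookup(word):
    """Search the dictionary; on a miss, stem the word and search again."""
    if word in DICTIONARY:
        return word, DICTIONARY[word]
    for infinitive in candidate_infinitives(word):
        if infinitive in DICTIONARY:
            return infinitive, DICTIONARY[infinitive]
    return None


if __name__ == "__main__":
    for query in ("raftan", "raftam", "mikharidim", "kharid", "ketab"):
        print(query, "->", lookup(query))
```

Validating each candidate against the dictionary keeps the sketch from over-stripping: for `kharid`, removing `id` yields a candidate that is not in the dictionary, so the empty ending is tried and the infinitive `kharidan` is found.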
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_117477+SOURCE1+SOURCE1.0.PDF | Restricted Access | 2.21 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.