Stemming algorithm for different tenses to improve Persian dictionary

Arash Ghazvini

Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/476533

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Tengku Mohd Tengku Sembok, Prof. Dr.
dc.contributor.author	Arash Ghazvini
dc.contributor.other	P56178	-
dc.date.accessioned	2023-10-06T09:20:30Z	-
dc.date.available	2023-10-06T09:20:30Z	-
dc.date.issued	2012-08-06
dc.identifier.other	ukmvital:117477
dc.identifier.other	P56178	-
dc.identifier.uri	https://ptsldigital.ukm.my/jspui/handle/123456789/476533	-
dc.description	Persian is an Indo-European language known for its complex morphology. Persian has a variety of tenses; this research project focuses on the past subjunctive, past perfect, continuous past, present perfect and simple past. A verb may appear in different forms in a sentence depending on person, number, gender, tense, mood and the occurrence of certain roots. This research project presents the development of a Persian stemming algorithm for these tenses and its impact on improving a Persian dictionary. The main problem is to design a Persian stemming algorithm that removes affixes from a verb in order to extract its root and convert it to the infinitive. The objectives are to design such an algorithm for verbs in the mentioned tenses and to implement it so that the meaning of the infinitive can be retrieved from a Persian dictionary. The resulting dictionary is then tested to show higher accuracy than conventional Persian dictionaries, which return no results for affixed verbs because all verbs are stored in the database as infinitives. When a search returns a null result because the word is an affixed verb, the system processes the word with a stemming procedure based on finite state automata to remove the affixes and find the root of the verb; the root is then converted into an infinitive and the database is searched again. According to the findings, the dictionary based on the implemented Persian stemming algorithm is fully accurate for regular verbs in the mentioned tenses, that is, verbs formed from their infinitive according to general grammatical rules. This thesis has no student declaration statement.
dc.language.iso	eng
dc.publisher	UKM, Bangi
dc.relation	Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat
dc.subject	Computational linguistics
dc.subject	Computer algorithms
dc.subject	Natural language processing (Computer science)
dc.title	Stemming algorithm for different tenses to improve Persian dictionary
dc.type	theses
dc.rights.holder	UKM	-
dc.format.pages	107
dc.identifier.callno	QA76.9.A43G483 2012 3 tesis
dc.identifier.barcode	002596 (2012)
Appears in Collections:	Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat
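
The lookup procedure summarised in the dc.description field (search, fall back to stemming on a null result, convert the root to an infinitive, search again) can be illustrated with a minimal Python sketch. It is not the thesis's finite state automaton, which is not reproduced in this record. It assumes only general Persian grammar: a regular infinitive is the past stem plus "ن", the simple past adds a personal ending to the past stem, and the continuous past adds the prefix "می". The multi-word tenses (present perfect, past perfect, past subjunctive) are left out. The names DICTIONARY, candidate_infinitives and lookup, and the three dictionary entries, are hypothetical.

```python
# Hypothetical toy dictionary: infinitive -> English gloss (not from the thesis).
DICTIONARY = {
    "رفتن": "to go",
    "خوردن": "to eat",
    "دیدن": "to see",
}

ZWNJ = "\u200c"   # zero-width non-joiner, often written after the prefix
PREFIX = "می"     # continuous-past prefix
# Past-tense personal endings, longer ones first; "" is third person singular.
ENDINGS = ["یم", "ید", "ند", "م", "ی", ""]


def candidate_infinitives(word):
    """Yield possible infinitives by stripping affixes and appending 'ن'."""
    forms = [word]
    if word.startswith(PREFIX):
        # Also try the word with the prefix (and any ZWNJ) removed.
        forms.append(word[len(PREFIX):].lstrip(ZWNJ))
    for form in forms:
        for ending in ENDINGS:
            if ending and not form.endswith(ending):
                continue
            stem = form[:len(form) - len(ending)]
            if stem:
                yield stem + "ن"


def lookup(word):
    """Direct search first; on a miss, stem and search again."""
    if word in DICTIONARY:
        return word, DICTIONARY[word]
    for infinitive in candidate_infinitives(word):
        if infinitive in DICTIONARY:
            return infinitive, DICTIONARY[infinitive]
    return None  # no entry, even after stemming
```

With this sketch, lookup("رفتم") ("I went") resolves to the entry for "رفتن", while a word that is not a verb form of any stored infinitive returns None. Trying every ending rather than stopping at the first match is a deliberate choice: "دید" ("he saw") ends in the letters of the ending "ید", so a greedy strip would miss the correct stem.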

Files in This Item:

File	Description	Size	Format
ukmvital_117477+SOURCE1+SOURCE1.0.PDF Restricted Access		2.21 MB	Adobe PDF
