Please use this identifier to cite or link to this item:
https://ptsldigital.ukm.my/jspui/handle/123456789/500010
Title: | Three dimensional protein motif analyses using graph theoretical algorithms |
Authors: | Nurul Nadzirin (P58671) |
Supervisor: | Mohd. Firdaus Mohd. Raih, Prof. Madya Dr. |
Keywords: | Protein structure analysis; Three dimensional; Proteins |
Issue Date: | 18-Dec-2016 |
Description: | This thesis documents the development of three graph theoretical systems for applications on protein structures. The systems involve three programs: ASSAM compares a protein motif to different databases of protein structures; SPRITE compares a protein structure to different databases of protein motifs; and IMAAAGINE allows hypothetical arrangements of three-dimensional protein motifs to be searched in different databases of protein structures. These programs could find useful applications in providing annotations for the current set of proteins of unknown function in the Protein Data Bank (PDB), primarily via the discovery of potential functional sites in the structures. The research objectives are: (i) to compile an inventory of true uncharacterized proteins, (ii) to develop processes to compare 3D motifs, (iii) to develop a process to search for hypothetical 3D motifs, and (iv) to analyze the motifs discovered using the processes developed. The systems for SPRITE, ASSAM and IMAAAGINE were built by: (i) compiling the structures to be used in the databases, (ii) converting the structures or motifs into graph representations, and finally (iii) developing a pipeline to perform the comparison and searching process. Web servers for all three programs were developed and published for free online use by the public. Following the development, datasets were collected to probe their functionalities. Proteins of unknown function in the PDB were first assessed to identify methods that could provide evidence for their potential function(s). Well-established sequence and structure comparison methods, i.e. BLAST and Dali, were used to relate this set of 2549 proteins to characterized proteins. It was discovered that the annotation carried out by the PDB lags behind the rate at which protein functions are discovered experimentally.
It was also discovered that a significant number of uncharacterized proteins in the PDB qualify for re-annotation because functionally characterized homologs have emerged since the date they were deposited. This result constitutes a relatively easy way to provide annotations for these proteins. Consequently, the set of ‘true’ proteins of unknown function in the PDB was ascertained to consist of 1084 proteins, whereas a further 597 proteins with some degree of homology with characterized proteins were compiled into a different dataset. These datasets were then subjected to analyses using SPRITE and ASSAM. Several interesting cases were examined in detail, and additional analyses were performed to corroborate the hypothesis that these are functional or important motifs. Protein motifs shared by different proteins could be either macromolecular homologs that have been conserved by related proteins through time, or macromolecular analogs that have arisen in unrelated proteins to perform a certain function. Both groups of motifs were identified and are documented in this thesis. Additionally, a search and analysis of motif triplets was carried out using IMAAAGINE. Through these analyses, the graph theoretical programs were demonstrated to be both applicable and a valuable addition to the programs already available for protein structure analysis. "Certification of Master's/Doctoral Thesis" is not available. |
Pages: | 168 |
Call Number: | QD431.N837 2017 tesis |
Publisher: | UKM, Bangi |
Appears in Collections: | Faculty of Science and Technology / Fakulti Sains dan Teknologi |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ukmvital_97902+SOURCE1+SOURCE1.0.PDF (Restricted Access) | | 198.56 kB | Adobe PDF | View/Open |