Please use this identifier to cite or link to this item: https://ptsldigital.ukm.my/jspui/handle/123456789/772434
Title: Enhanced loss and activation functions in convolutional neural network for optical character classification
Authors: Hani Nayef, Bahera (P99947)
Supervisor: Siti Norul Huda Sheikh Abdullah, Assoc. Prof. Dr.
Keywords: Universiti Kebangsaan Malaysia -- Dissertations
Dissertations, Academic -- Malaysia
Signal processing
Spectrum analysis
Issue Date: 26-Jun-2022
Abstract: Deep learning techniques such as convolutional neural networks (CNN) are employed in text recognition to overcome the processing complexity of traditional methods. Most handwritten character recognition suffers from an imbalance between positive and negative vectors. This issue degrades CNN performance when activation functions such as the Rectified Linear Unit (ReLU) and Leaky ReLU are adopted in the successive deep layers of the architecture. Hence, this study first proposes an optimized Leaky ReLU (OLReLU) that retains more negative vector units, implemented in a proposed CNN architecture with a batch normalization layer. Using the El-Sawy, Altwaijry, and VGG16 models, the proposed methods were evaluated on five datasets: AHCD, a self-collected dataset, HIJJA, MNIST, and AIA9K. The results showed clear improvement over known Leaky ReLU variants: 98.5% accuracy for AHCD, 96.9% for the self-collected data, 99.6% for MNIST digits, 90% for HIJJA, and 99% for AIA9K. The proposed CNN architecture with OLReLU maintained stable accuracy and loss rates across the training, validation, and testing phases.

Handwritten character samples vary in style, shape, and size because of writers' differing handwriting styles and morphological similarities: many characters share the same main shape but differ in the position and number of dots. Common loss functions used for handwritten character recognition, such as cross-entropy and sparse cross-entropy, cannot cope with data samples filled with outliers. An improved Mean Square Error (MSE) is therefore proposed to overcome the vanishing issue by replacing the total number of samples in the MSE formula with the summation of the class probabilities of the training samples. Three models were applied to test the improved MSE: the proposed CNN architecture, the El-Sawy model, and the VGG16 model, each with a ReLU activation function and a Softmax classifier. Under ten-fold cross-validation, the proposed CNN model with the improved MSE showed notable performance on the self-collected, AHCD, and MNIST datasets: 89.6±8.5% with a 0.0139 error rate, 96.46±0.22% with a 0.0049 error rate, and 99.3±0.08% with a 0.00098 error rate, respectively.

Handwritten text recognition from natural images is a difficult task because of the variability of image resolution and contrast. The proposed method combines a CNN using ReLU and OLReLU with dual max pooling and concatenated CNN layers to extract image features. A Long Short-Term Memory (LSTM) network encodes the information both forward and backward, which suits text-line and text-connector characteristics. The model is evaluated using training and validation loss on the Mjsynthetic and IAM datasets. The results showed remarkable improvement in recognizing characters and reforming words: the best validation loss rate on the IAM dataset, 2.09%, was achieved with dual max pooling and OLReLU, while on the Mjsynthetic dataset the best validation loss rate, 2.2%, was achieved with concatenated CNN layers and ReLU.
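Illustration (not from the thesis): the abstract does not give the exact OLReLU formula, so the snippet below is only a minimal sketch of the idea it describes. Unlike ReLU, which zeroes negative activations, a Leaky-ReLU-style function scales them, and OLReLU is said to retain more of these negative vector units. The function name olrelu and the slope value alpha are illustrative placeholders, not the author's implementation.

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Standard Leaky ReLU: negative inputs survive, scaled by a small fixed slope.
        return np.where(x >= 0, x, alpha * x)

    def olrelu(x, alpha=0.3):
        # Hypothetical stand-in for the thesis's optimized Leaky ReLU (OLReLU):
        # a larger, tuned negative slope so more information from negative
        # units reaches the successive deep layers.
        return np.where(x >= 0, x, alpha * x)

    # Negative activations are attenuated, not zeroed out as in plain ReLU.
    x = np.array([-2.0, -0.5, 0.0, 1.5])
    print(leaky_relu(x))  # [-0.02  -0.005  0.     1.5 ]
    print(olrelu(x))      # [-0.6   -0.15   0.     1.5 ]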
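Illustration (not from the thesis): the improved MSE is described as replacing the sample count N in the standard formula L = (1/N) * sum_i (y_i - p_i)^2 with the summation of the class probabilities of the training samples. One plausible reading, shown below, divides by the summed predicted probabilities of the true classes; this interpretation, and the helper improved_mse, are assumptions for illustration only.

    import numpy as np

    def improved_mse(y_true, probs):
        # y_true: one-hot labels; probs: softmax outputs (each row sums to 1).
        # Squared error as in standard MSE ...
        squared_error = np.sum((y_true - probs) ** 2)
        # ... but normalized by the summed true-class probabilities instead of
        # the sample count N (assumed reading of the thesis formula).
        return squared_error / np.sum(probs * y_true)

    y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    probs  = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
    print(improved_mse(y_true, probs))  # 0.2 / 1.5 ≈ 0.133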
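Illustration (not from the thesis): for the natural-image text recognizer, the abstract names three ingredients: convolutional feature extraction with dual max pooling and concatenated CNN layers, followed by an LSTM that encodes the sequence forward and backward (i.e., bidirectionally). The Keras sketch below wires those ingredients together under stated assumptions; the 32x128 input shape, filter counts, 80-class charset, and CTC-style output are invented for the example and do not come from the thesis.

    from tensorflow.keras import layers, models

    NUM_CLASSES = 80  # placeholder charset size, not from the thesis

    inp = layers.Input(shape=(32, 128, 1))  # assumed grayscale text-line image

    # Concatenated CNN branches: one hedged reading of "concatenating CNN layers".
    a = layers.Conv2D(64, 3, padding="same", activation="relu")(inp)
    b = layers.Conv2D(64, 5, padding="same", activation="relu")(inp)
    x = layers.Concatenate()([a, b])

    # Dual max pooling: two pooling stages that shrink the height axis while
    # keeping enough width resolution for the sequence model.
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=(2, 1))(x)  # halve height only

    # Turn each width position into one time step for the recurrent encoder.
    x = layers.Permute((2, 1, 3))(x)      # (width, height, channels)
    x = layers.Reshape((64, 8 * 128))(x)  # 64 time steps, 1024 features each

    # Bidirectional LSTM reads the feature sequence forward and backward.
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    out = layers.Dense(NUM_CLASSES + 1, activation="softmax")(x)  # +1 blank label

    model = models.Model(inp, out)
    model.summary()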
Description: Full-text
Pages: 269
Publisher: UKM, Bangi
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:
File: ENHANCED LOSS AND ACTIVATION FUNCTIONS IN CONVOLUTIONAL.pdf (Restricted Access)
Size: 5.55 MB
Format: Adobe PDF

