1.1. Abstract
Text extraction from natural scenes provides information that is frequently used to understand image content and to retrieve visual data. The semantic information in a scene image helps people understand their surroundings as a whole. Text detection and character recognition are made more difficult by the fact that text in natural photos shows a highly varied appearance in an unconstrained setting. Text extraction from images refers to the process of automatically extracting text information from images. This can be challenging due to variations in font, size, orientation, and background. However, recent advances in computer vision and deep learning have led to the development of efficient and accurate text extraction methods. Sentiment analysis involves analyzing and understanding human emotions and opinions from text data.
This can be useful for a variety of applications, such as predicting customer satisfaction, analyzing social media sentiment, or understanding public opinion on a particular topic. Recent advances in natural language processing and deep learning have led to the development of highly accurate and efficient sentiment analysis methods. Object identification and tracking, in turn, involves identifying and tracking objects of interest in a video stream or series of images. This can be used for various purposes, such as tracking vehicles in a traffic scene, monitoring the movement of people in a public space, or tracking animals in the wild. Deep learning (DL) based methods have been shown to be highly effective in this task.
This thesis reviews current methods for text extraction, sentiment analysis, object recognition, and object tracking. The review covers a variety of text extraction applications, such as text-based page segmentation, image indexing, document retrieval, object identification, nameplate recognition, document coding, video content analysis, street signs, and text-based video indexing. It also covers topics such as document examination, object identification, and object tracking. The literature survey also analyzes the benefits and limitations observed in the reviewed studies, which in turn serve as motivation for the proposed research work.
This thesis develops a model that uses a weighted Naive Bayes Classifier (WNBC) based DL algorithm to efficiently detect text and recognize characters in natural scene photos, addressing the issues mentioned above. The Guided Image Filter (GIF), introduced during pre-processing, is used to remove noise from natural scene photographs. The Stroke Width Transform (SWT) and Gabor transform (GT) are used to extract the features needed for classification. Using these extracted features, WNBC and a Deep Neural Network with Adaptive Galactic Swarm Optimization (DNN-AGSO) detect text and recognize characters. Finally, accuracy, precision, recall, and F1-score are calculated to assess the performance of the proposed technique. The proposed approach outperforms other known approaches when tested on the IIIT5K dataset.
Furthermore, Twitter data and natural language processing (NLP) techniques are used to analyze the public discourse and sentiments related to the COVID-19 pandemic. The COVID-19 pandemic has caused a large number of fatalities and is an unprecedented public health disaster. Because Twitter has grown into a significant venue for public interaction, researchers now have the chance to examine how the public is responding to the outbreak. Through sentiment analysis, text analysis, and theme analysis of 100,000 tweets, researchers gain insights into public responses and identify 30 issues categorized into themes such as Public Health, COVID-19 on a global scale, and number of Cases/Deaths.
This study demonstrates the potential of real-time analysis using Twitter data to combat misinformation and provide accurate guidance. The researchers analyzed 100,000 tweets by searching for hashtags such as #coronavirus, #coronavirusoutbreak, #coronavirusPandemic, #COVID-19, #epitwitter, #ihavecorona, #StayHomeStaySafe, and #TestTraceIsolate. The analysis, conducted using tools such as Python, Google NLP, and NVivo, involved sentiment analysis, text analysis, and theme analysis of the tweets. The results showed that 29.56% of the tweets were classified as positive, 29.54% as mixed, 23.23% as neutral, and 18.069% as negative. The frequently used terms were "cases," "home," "people," and "assistance." In this study, 30 issues were identified and categorized into three themes: Public Health, COVID-19 on a global scale, and number of Cases/Deaths. The study showcases the potential of using Twitter data and NLP techniques to investigate public discourse and opinions related to the COVID-19 pandemic. Real-time analysis can reduce misinformation and make it easier to provide accurate guidance to individuals.
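As a rough illustration of the tweet selection and tallying steps described above, the following minimal sketch filters tweets by the listed hashtags and converts sentiment labels into percentage shares. It assumes the sentiment label of each tweet is already available from an external tool; the function names and the sample tweets are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hashtags listed in the study, lower-cased so matching is case-insensitive.
HASHTAGS = {
    "#coronavirus", "#coronavirusoutbreak", "#coronaviruspandemic",
    "#covid-19", "#epitwitter", "#ihavecorona",
    "#stayhomestaysafe", "#testtraceisolate",
}


def has_study_hashtag(text):
    """Return True if any whitespace-separated token is one of the hashtags."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return any(tok in HASHTAGS for tok in tokens)


def sentiment_shares(labels):
    """Map each sentiment label to its percentage share (2 decimals)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 2) for label, n in counts.items()}


# Hypothetical sample: (tweet text, sentiment label assigned elsewhere).
tweets = [
    ("Stay safe everyone #StayHomeStaySafe", "positive"),
    ("New cases rising #coronavirus.", "negative"),
    ("Lunch was great", "positive"),
    ("Testing update #TestTraceIsolate", "neutral"),
    ("Mixed feelings today #COVID-19", "mixed"),
]
kept = [label for text, label in tweets if has_study_hashtag(text)]
shares = sentiment_shares(kept)
print(shares)
```

Token-level matching is used here so that a hashtag such as #coronavirus does not also match longer tags that merely start with the same characters.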
Lastly, a case study is conducted to explore object tracking, categorization, and autopilot guidance for IoT applications using image processing approaches, specifically considering the passive homing missile scenario. In addition, the thesis includes case studies that present existing models for object identification and object tracking in image processing, accompanied by thorough mathematical analysis to enhance the understanding and evaluation of these models.
The motivation of this thesis is therefore to overcome challenges such as lighting conditions, shadows, and environmental effects on images when extracting text from them. The text extracted from scene images is typically used for tasks such as locating objects, guiding tourists, reading licence plates, assisting blind people, and sentiment tagging of text images based on COVID-19 tweets, as well as in applications such as object tracking and categorization using image processing approaches for autopilot guidance in IoT. Effective DL strategies for text extraction, object identification, and classification are provided based on the aforementioned techniques.
• Some of the specific contributions of this thesis are as follows:
Text extraction from scene images is now widely used in various real-time applications. The text extracted from scene images is normally used in applications such as obtaining location information, guiding tourists, detecting license plates, assisting visually impaired people, and understanding the COVID-19 response of Twitter users through a text analysis approach. An effective deep neural network is introduced in this method for text extraction. Before that, the image needs to be analyzed to determine whether it carries any text information. To accomplish this, a WNBC approach is introduced before the deep learning based character recognition. WNBC classifies each image as textual or non-textual.
The misclassification error that occurs during the text classification process is reduced by the EPO algorithm. The images that carry information/text are then given as input to DNN-AGSO for text extraction. During text extraction, a poorly chosen weight parameter in the DNN degrades its accuracy. To avoid this, an adaptive optimization is hybridized with the DNN to select the optimal weight parameter, which improves the accuracy of the whole extraction process. Further, to maximize the performance of GSO, fuzzy logic is combined with GSO, which enhances the optimal weight selection in the DNN; selecting the optimal weight parameter is essential for improving the accuracy of the DNN while keeping the misclassification rate low. Using the proposed methodology, the following objectives are addressed.
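To make the text versus non-text decision more concrete, here is a minimal sketch of a feature-weighted Gaussian Naive Bayes classifier, in which each feature's log-likelihood is scaled by a weight before being added to the class score. This is one common way to weight Naive Bayes and is only an assumption about the general idea; it is not the thesis's exact WNBC formulation. The feature weights are fixed by hand here rather than tuned by an optimizer, and the two toy features and their values are hypothetical.

```python
import math
from collections import defaultdict


class WeightedGaussianNB:
    """Gaussian Naive Bayes with one weight per feature (illustrative only)."""

    def __init__(self, weights, var_floor=1e-6):
        self.weights = list(weights)   # one weight per feature
        self.var_floor = var_floor     # avoids division by zero variance
        self.priors = {}
        self.stats = {}                # label -> [(mean, variance), ...]

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        for label, rows in groups.items():
            self.priors[label] = len(rows) / len(X)
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            variances = [
                max(sum((v - m) ** 2 for v in c) / len(c), self.var_floor)
                for c, m in zip(cols, means)
            ]
            self.stats[label] = list(zip(means, variances))
        return self

    def _log_score(self, row, label):
        # log P(c) + sum_i w_i * log N(x_i | mean_ci, var_ci)
        score = math.log(self.priors[label])
        for x, w, (mean, var) in zip(row, self.weights, self.stats[label]):
            log_lik = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            score += w * log_lik
        return score

    def predict(self, row):
        return max(self.priors, key=lambda label: self._log_score(row, label))


# Hypothetical 2-feature toy data (e.g. a stroke statistic and a texture energy).
X = [[0.20, 0.90], [0.30, 0.80], [0.25, 0.85],
     [0.90, 0.20], [0.80, 0.30], [0.85, 0.10]]
y = ["text", "text", "text", "non-text", "non-text", "non-text"]
model = WeightedGaussianNB(weights=[1.0, 0.5]).fit(X, y)
print(model.predict([0.22, 0.88]))
```

With all weights set to 1.0 this reduces to ordinary Gaussian Naive Bayes, so the weights are the only quantities an external optimizer would need to adjust.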
To identify the basic details of the text extraction model and the basic mathematical models related to the proposed method and the existing methods it is compared with.
A system for image enhancement and edge smoothing with high-quality artefact removal has been developed.
The segmentation of the image is carried out using the watershed algorithm to locate objects and boundaries in images.
A novel feature extraction methodology has been developed that uses Gabor Transform (GT) based texture features together with the Stroke Width Transform (SWT).
Text is modelled as a combination of stroke components with a variety of orientations, and text features are extracted from the combinations and distributions of these stroke components.
A multilayer weighted Naïve Bayes classifier has been developed for recognizing text and non-text portions of the images. The error produced by the classifier is minimized with the EPO algorithm, which updates the weights in the weighted Naïve Bayes algorithm.
A novel deep neural network based model for intelligent character recognition and labelling of scene text has been proposed. The extracted text regions are split into image patches and given as input to the DNN with AGSO model, and finally the misclassifications are removed using the Manhattan distance.
To enhance the accuracy of the system, a deep neural network with adaptive galactic swarm optimization for image text extraction is proposed.
An architecture is designed for achieving efficient text extraction performance for understanding the COVID-19 response on Twitter.
An optimization of target classification and tracking, together with mathematical modelling for autopilot control, is also presented in the thesis as a case study of the suggested method.
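The Gabor Transform (GT) based texture features named in the objectives above can be illustrated with the standard real-valued Gabor kernel, a Gaussian envelope multiplied by a cosine carrier. The sketch below is a plain-Python illustration under assumed parameter values (kernel size, wavelength, sigma, gamma) and is not the thesis's feature extractor; the helper `filter_energy` and the striped test patch are hypothetical.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a Gabor kernel as a size x size list of lists (odd size)."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_energy(patch, kernel):
    """Squared response of the kernel centred on a patch of the same shape."""
    response = sum(p * k for prow, krow in zip(patch, kernel)
                   for p, k in zip(prow, krow))
    return response ** 2


# Hypothetical patch: vertical stripes with a period of 4 pixels.
size = 7
half = size // 2
patch = [[math.cos(2 * math.pi * x / 4) for x in range(-half, half + 1)]
         for _ in range(size)]
aligned = filter_energy(patch, gabor_kernel(size, wavelength=4, theta=0.0, sigma=2.0))
crossed = filter_energy(patch, gabor_kernel(size, wavelength=4, theta=math.pi / 2, sigma=2.0))
print(aligned > crossed)
```

A bank of such kernels at several orientations and wavelengths yields one energy value per kernel, which is the usual form in which Gabor responses are used as texture features.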
1.2. Motivation
In document images it is easy to differentiate between text and non-text regions and to isolate each character from its surroundings; text detection in natural scenes, by contrast, is a difficult and complex task. Lighting conditions are another important factor that makes text detection and recognition in natural environments challenging. Outdoor photographs typically contain varied backgrounds, and lighting conditions and shadows affect how an image is illuminated. This makes the task of automatically extracting text more challenging.
Adequate filtering methods are therefore required. Many text detection methods use sliding windows. These methods take sub-images of the original image and determine whether they contain text or not. Because this procedure is repeated many times, an effective classifier is needed that can separate the text and non-text portions of natural scene images with a low misclassification error. The text retrieved from scene images is typically used for locating objects, directing tourists, reading licence plates, helping blind people, etc. In this work, a powerful DL technique is introduced for text extraction. To determine whether a given image contains any text information, the image must first be analysed. Text detection in natural scenes is more challenging and complicated than text extraction from document images, where text and non-text regions are clearly distinct and each character is separated from its context.
In natural scenes, text can appear in numerous forms: dark text on a light background and vice versa, and in a wide variety of fonts, even for characters of the same word. Parts of words can be occluded by objects in the environment, which can make those parts impossible to detect. Other factors, such as camera settings, may cause blurry images or perspective distortions. A major factor that makes text detection and recognition in natural scenes difficult is the illumination conditions, so a proper filtering technique is needed. Ambient light may create reflections on text surfaces, objects in the environment may cast shadows on them, and the intensity of objects depends on the light source. The effect of illumination conditions on text detection has also been reported. Many text detection approaches make use of a sliding window: they extract sub-images from the original image, classify each as text or non-text, and repeat this with sliding windows of different scales. This requires efficient classifier evaluation with minimal error.
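The multi-scale sliding-window procedure described above can be sketched as follows. This is a minimal illustration that only enumerates and crops candidate windows; the window sizes, the 50% stride, and the function names are assumptions, and the text/non-text classifier that would score each crop is left out.

```python
def sliding_windows(width, height, window_sizes, stride_ratio=0.5):
    """Yield (x, y, size) for square windows of each size that fit the image."""
    for size in window_sizes:
        if size > width or size > height:
            continue  # this scale does not fit inside the image
        stride = max(1, int(size * stride_ratio))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size


def crop(image, x, y, size):
    """Cut a size x size sub-image out of a row-major list-of-lists image."""
    return [row[x:x + size] for row in image[y:y + size]]


# Hypothetical 8 x 8 image whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
windows = list(sliding_windows(8, 8, window_sizes=[4, 8]))
sub_images = [crop(image, x, y, s) for x, y, s in windows]
print(len(windows))  # 9 windows of size 4 plus 1 window of size 8 -> 10
```

The number of windows grows quickly as the stride shrinks or scales are added, which is why the cost and error rate of the per-window classifier matter so much.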
1.3. Objective and Scope of Thesis
The research objectives of the thesis are given below:
Some studies on text extraction from complex degraded images and on the application of text extraction algorithms had been carried out prior to this thesis. Joan, S.F., et al. surveyed text information extraction from born-digital and scene text images. Text extraction was first studied to better understand how text extraction and recognition are performed, and then to better understand the various types of text extraction methodology. The performance of existing text extraction techniques was investigated, including those of Khlif, Zhu, and Zhang, R-FCN, Faster RCNN, etc., which employ convolutional neural networks and adaptive color reduction. It was found that character recognition and text identification are made more difficult by the fact that text in natural images shows a highly varied appearance in an unconstrained setting. However, with the advent of deep learning (DL), neural networks have substantially improved scene text extraction performance. DL-based scene text extraction processes are driven by a number of major reasons.
A new methodology for text extraction and recognition has been designed. A weighted Naïve Bayes Classifier (WNBC) based deep learning process is used in this framework to effectively detect text and recognize characters in natural scene images. Natural scene images usually carry some noise; to remove it, the Guided Image Filter (GIF) is introduced at the pre-processing stage. The features that are useful for the classification process are extracted using the Gabor transform (GT) and Stroke Width Transform (SWT) techniques. Finally, with these extracted features, text detection and character recognition are achieved by WNBC and an optimization based deep learning process. Then, performance metrics such as accuracy, F1-score, precision, MAE, MSE, and recall are evaluated on the IIIT5K dataset to estimate the efficiency of the proposed approach.
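For reference, the performance metrics named above have standard definitions that can be sketched in a few lines. The example below uses binary labels (1 for text, 0 for non-text) and hypothetical predictions; it illustrates the formulas only and does not reproduce any result from the thesis.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def error_metrics(y_true, y_pred):
    """Mean absolute error (MAE) and mean squared error (MSE)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse}


# Hypothetical ground truth and predictions (1 = text, 0 = non-text).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
scores = classification_metrics(y_true, y_pred)
errors = error_metrics(y_true, y_pred)
print(scores, errors)
```

For 0/1 labels MAE and MSE both equal the misclassification rate; they become distinct only when the outputs are real-valued scores.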
Based on the text extraction methodology, a novel text analysis methodology for understanding the COVID-19 response of Twitter users has been proposed. Real-time text analysis may make it possible to reduce the spread of false messages and to provide the right guidelines to people more efficiently during the COVID-19 outbreak. This type of text analysis, using character identification and recognition, can help government and healthcare authorities understand and react to public emergencies. It can also be used to maintain public trust.
An optimization of image processing algorithms for target classification and tracking has been worked out and is presented in this thesis as a case study. A popular approach for recognizing feature points is the basic SURF method, which is characterized by high speed and robustness. However, this method has flaws, including limited ability to distinguish feature points and inaccurate primary orientation of feature points. These can readily lead to many mismatches and fewer matched image pairs. To substantially increase the number of matched pairs and improve matching effectiveness when identifying items of interest, the SURF method is enhanced for object detection using cell-based fruit fly optimization.
1.4. Thesis Organization
The thesis is organized chapter-wise, as outlined below:
Chapter-1 This chapter discusses the background of image text extraction and its contribution to various applications. The impact of artificial intelligence on text extraction from images is also discussed. In addition, the challenges, problem definition, research objectives, motivation, and thesis outline are provided.
Chapter-2 This chapter discusses the literature related to the proposed text extraction from images by different authors. The advantages and limitations of these methods are also discussed. A survey of deep network based feature extraction, optimization based feature selection, and artificial intelligence based text extraction is also carried out.
Chapter-3 This chapter discusses the basic details of the text extraction model and the basic mathematical models related to the proposed method and the existing methods it is compared with. The chapter ends with a summary.
Chapter-4 This chapter discusses the deep neural network with adaptive galactic swarm optimization for image text extraction. The proposed methodology is explained in a detailed step-by-step process. The chapter concludes with a summary.
Chapter-5 This chapter discusses text extraction for understanding the COVID-19 response on Twitter and the architecture designed for achieving efficient performance in this application. An algorithm based comparative analysis is then performed. The chapter concludes with a summary.
Chapter-6 This chapter discusses a case study of image processing techniques for object classification, tracking, and categorization for autopilot guidance in IoT.
Chapter-7 Conclusion and Future Scope.
• PUBLICATIONS
1. Digvijay Pandey, Subodh Wairya, “An optimization of target classification tracking and mathematical modelling for control of autopilot”, The Imaging Science Journal, Taylor & Francis, pp. 1-16, 2023. DOI: https://doi.org/10.1080/13682199.2023.2169987.
2. Digvijay Pandey, Pandey B.K. & Subodh Wairya. “Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images”, Soft Comput. 25(2): 1563-1580 (2021), Electronic ISSN: 1433-7479, Print ISSN: 1432-7643, https://doi.org/10.1007/s00500-020-05245-4
3. Digvijay Pandey, Subodh Wairya, B Pradhan, Wangma, “Understanding COVID-19 response by twitter users: A text analysis approach”, pp. 1-6, Heliyon (8), 2022.
4. Digvijay Pandey, Subodh Wairya, Sharma, M. et al. “An approach for object tracking, categorization, and autopilot guidance for passive homing missiles”, Aerospace Systems (2022): pp. 1-14. Springer Nature Singapore, https://doi.org/10.1007/s42401-022-00150-0.
5. Digvijay Pandey, Subodh Wairya. "A Novel Algorithm to Detect and Transmit Human-Directed Signboard Image Text to Vehicle Using 5G-Enabled Wireless Networks." IJDAI vol.14, no.1 2022: pp.1-11. http://doi.org/10.4018/IJDAI.291084.
6. Digvijay Pandey, Binay Kumar Pandey, Subodh Wairya, et al., “Analysis of Text Detection, Extraction and Recognition from Complex Degraded Images and Videos” Journal of Critical Reviews (JCR) 2020; Vol. 7, Issue 18, pp. 427-433, ISSN 2394-5125, doi: 10.31838/jcr.07.18.63.
7. Digvijay Pandey, Binay Kumar Pandey, and Subodh Wairya, “An Approach To Text Extraction From Complex Degraded Scene”, International Journal of Computational and Biological Sciences (IJCBS), Vol. 1 No. 2 (2020): ISSN 2708-3551.
8. Digvijay Pandey, Binay Kumar Pandey, Subodh Wairya “Study of Various Types Noise and Text Extraction Algorithms for Degraded Complex Image” Journal of Emerging Technologies and Innovative Research, vol. 6, Issue 6. pp. 234-246, June 2019. ISSN: 2349-5162.
9. Digvijay Pandey, Subodh Wairya et al., “Secret data transmission using advance steganography and image compression”, International Journal of Nonlinear Analysis and Applications. Volume 12, Special Issue, Winter and Spring 2021, 1243-1257 ISSN: 2008-6822 http://dx.doi.org/10.22075/ijnaa.2021.5635.
10. Digvijay Pandey, Binay Kumar Pandey, Subodh Wairya “Study of Various Techniques Used for Video Retrieval” Journal of Emerging Technologies and Innovative Research, vol. 6, issue 6, pp.850-853, June 2019. ISSN Number: 2349-5162.
11. Pandey, B. K., Digvijay Pandey, Subodh Wairya, & Agarwal, G. (2021). Deep Learning and Particle Swarm Optimization-Based Techniques for Visually Impaired Humans' Text Recognition and Identification. Augment Hum Res 6, 14 (2021). https://doi.org/10.1007/s41133-021-00051-5.
12. Digvijay Pandey & Subodh Wairya. (2022). Performance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In VLSI, Microwave and Wireless Technologies: Select Proceedings of ICVMWT 2021 (pp. 195-203). Singapore: Springer Nature Singapore.
Content Owner / Guide:
Title: Performance Analysis on Text Extraction from Complex Degraded Images
Year Awarded (Blank if Not Awarded): 2023
Type: Doctor of Philosophy
Place of Work:
E-Mail:
Roll No: 16ECE2172
Registration Date:
Area of Research: Image Processing