An Intelligent License Plate Detection and Recognition Model Using Deep Neural Networks

: One of the largest automotive sectors in the world is India. The number of vehicles traveling by road has increased in recent times. In malls or other crowded places, many vehicles enter and exit the parking area. Due to the increase in vehicles, it is difficult to manually note down the license plate number of all the vehicles passing in and out of the parking area. Hence, it is necessary to develop an Automatic License Plate Detection and Recognition (ALPDR) model that recognize the license plate number of vehicles automatically. To automate this process, we propose a three-step process that will detect the license plate, segment the characters and recognize the characters present in it. Detection is done by converting the input image to a bi-level image. Using region props the characters are segmented from the detected license plate. A two-layer CNN model is developed to recognize the segmented characters. The proposed model automatically updates the details of the car entering and exiting the parking area to the database. The proposed ALPDR model has been tested in several conditions such as blurred images, different distances from the cameras, day and night conditions on the stationary vehicles. Experimental result shows that the proposed system achieves 91.1%, 96.7%, and 98.8% accuracy on license plate detection, segmentation, and recognition respectively which is superior to state-of-the-art literature models.


Introduction
Over the past years, Automatic License Plate Detection and Recognition (ALPDR) [1] has played a critical role in the development of smart cities as a monitoring device for car tracking, traffic control, enforcement of strict traffic rules and policies [2]. ALPDR has been commonly used in traffic toll systems, smart parking systems, and many more across several countries. In the parking area, toll gates, or any other crowded places, several vehicles pass by and there is a need for automation where a machine would identify the License plate number of all the vehicles. Hence ALPDR is used in this case where it automates the whole process thus saving human power and this automation would have been impossible without the development of Deep Learning (DL). Recent advances in Deep Learning, particularly neural networks have contributed to the advancement over a range of Computer Vision [3], such as Object Detection [4], Optical Character Recognition (OCR) [5], and identification of objects have led to the development of ALPDR. www.aetic.theiaer.org Deep learning is a branch of machine learning where Artificial Neural Networks (ANN) are algorithms admired by the human brain that learns from large volumes of data [6]. Deep learning helps computers or machines to solve complicated problems even while using a dataset that is very diverse from the one by which the model was trained [7]. Deep Learning predicts and classifies the information by using filters which are passed as inputs through layers. It uses several layers to gradually remove higher-level features from the raw input. Computer Vision [8] is a field of Artificial intelligence that uses digital images in form of videos and images captured from cameras to correctly recognize objects. Computer vision concerns the automated retrieval, interpretation, and comprehension of useful knowledge from a single or series of images. It involves the creation of a theoretical or algorithmic basis for automatic visual comprehension. Computer vision concerns the philosophy behind artificial systems that can extract knowledge from images. Image data can take several types, such as video sequences, multidimensional data, viewing from multiple cameras, etc. Such deep learning models have been effectively used in the field of vehicle automation [9]- [14].
Recognizing a license plate involves detection of the license plate as the first step and then recognize the characters in the license plate [15]. There are several methods to recognize the license plate. Free software called Google Tesseract [16] is available online that can recognize the characters in the license plate. A twostep approach called adaptive recognition is used which finds templates in pixels, words, sentences, and letters. It uses one data stage for recognizing the characters and a second stage to fill out the letters by matching the missed word or sentence. Google Tesseract is pretrained and can be used directly with any model. The proposed approach uses a trained Convolutional Neural Network (CNN) [17], [18] to identify characters in different situations which will be explained at the later stage of this paper ALPDR is an open problem due to its vast diversity of image acquisition status and number plate identity which differs from country to country. ALPDR has been implemented in various places such as smart parking systems [19], tolls, and many more. Still, it faces problems with blurred images, images captured in the night, low-quality images, angle in which the images are captured. Many methods have been proposed, in some cases, three steps as detection, segmentation, and recognition are used. Detection of the license plate is one of the major tasks. Lin et al. in [20] have used You Only Look Once (YOLO) and Support Vector Machine (SVM) together to detect the license plate which provides higher accuracy. In some cases, only YOLO has been used for detection. The proposed method is based on edge detection where the RGB image is converted to a binary scale to check the edges of the license plate, once the edges are detected we make a bounding box over it. Then segmentation of all the characters is done using region props. Once done the characters boxed are recognized using trained Convolution Neural Network (CNN). All the processes are explained in the later stage of this paper.
The remainder of this paper is as follows. Section 2 presents the literature survey of various license plate recognition systems in the past. Section 3 explains the methodology of the proposed system and model architecture. Section 4 highlights the experiments and results performed to evaluate the model generalization in various cases. Section 5 presents the conclusion and future scope of the proposed ALPDR model.

Related Works
The section discusses the state-of-the-art literature in the field of license plate detection, recognition and segmentation. This section is divided into two parts they are license plate detection, and license plate recognition & segmentation.

License Plate Detection
Character recognition, segmentation, image processing, license plate location, and correction are needed for a license plate recognition system. By using the adaptive output algorithm image processing of the license plate is done. To detect the location of the license plate Lin et al. in [21] used a method called a Canny edge operator. To avoid the tilting problem in the license plate perspective transformation algorithm is used. For segmenting good image characters vertical projection method is used and for matching license plates with given characters in the library, the Huasdroff distance algorithm is used by the authors. Number plate detection in vehicles is important in case of any emergency where the police couldn't reach the vehicle www.aetic.theiaer.org or if any vehicle has been stolen. Hsieh et al. in [22] proposed a Real-Time Vehicle Number Plate Recognition System (RVNPR). This system works in real-time where the characters are extracted from the number plate. High definition cameras are used to take pictures of the moving vehicle and then the RVNPR software is used to extract the information from the image. YOLO (You Look Only Once), Segmentation, and last character Recognition are different types of algorithms used to recognize the license plate in a vehicle. Using YOLO the license plate is detected from a vehicle and recognized. The proposed system is tested and proved to be highly accurate in detecting and recognizing the license plate. Many applications are available in License Plate detection and recognition work on a single image that can identify a single license plate in an image. Lin et al. in [23] proposed a method that can detect multiple license plates from a single image. They have proposed a two-stage method based on the deep learning algorithms that can detect all the licenses available in a single image and recognizes the character using character recognition in Convolutional Neural Networks. After evaluating the proposed method's results show that the method achieves 98.23% of license plate detection and 97.38% of character recognition rate. Color based license plate detection is another way to detect the license plate as it is different from the car body color. In [24] Zhang et al. proposed a Sobel-Color Algorithm combined with MSER Algorithm that can locate the color features and Sobel edge in an image. Support Vector Machine (SVM) is used to detect the location of the license plate with much higher accuracy. To segment the characters available in a license plate, the license plate segmentation algorithm is designed. After segmentation of the license plate, the characters need to be identified for that purpose leNet-5 depth model is used to improve the accuracy of character recognition. To avoid the problem of multilingual license plate (LP) detection problem, Kessentini et al. [25] used YOLOv2 for detecting license plates. The second method is used to capture license plate recognition with cropped images. The two datasets used in the system are actual road surveillance and parking access control environments. To minimize the cost and time of processing, a new semi-automatic annotation method for LP images with labelled bounding box components is proposed. When contrasted with other approaches using the public AOLP dataset, the proposed approach outperformed. It gave a 97.67% LP recognition rate in the GAP-LP dataset, and 91.46% in the Radar dataset. For further work, the correlation of data stored on homeland and police stations requires correlation and also enhances the identification of vehicle identity. There are many systems of surveillance available all over the world.

License Plate Recognition & Segmentation
The number of vehicle growth in India is more. Due to an increase in the number of automobiles Automatic Number Plate Recognition (ANPR) is needed. ANPR systems are used in tracing stolen vehicles, tracking vehicles, and traffic examining, whereas implementing that system is a challenging task. Automatic License Plate Detection and Recognition (ALPR) is the most efficient detection and recognition mechanism for intelligent transport systems [9]. Singh et al. in [26] developed a deep neural network model for license plate detection and LSTM based model is used for license plate character recognition. The drawback of this method is it requires high-resolution cameras to capture high-quality images. Omar et al. [27] used a Mask-CNN framework license plate recognition. In the analysis, four databases are used where the images are taken into account with low image quality, blurred image, complex environment, and orientation. The accuracy rates reached on the AOLP and Caltech datasets are 99.3% and 98.9%, respectively. The technique's weakness is that it failed to detect LP and even the framework speed is not impressive. Hendry et al. [28] used a deep learning framework called You Only Look Once (YOLO) -darknet. A YOLO seven convolutional layer sliding window method is used to detect a single class. A sliding window detects the license plate, and a YOLO framework is used to detect each window. The number plate characterization can vary from country to country, as well as in orientation sizes and different languages. Laroca et al. [29] presented an automatic licenses plate recognition system based on the YOLO-CNN model. The proposed system is layout independent and improves the recognition accuracy through post-processing. The system provides a balanced trade-off between speed and accuracy of license plate recognition. The methods proposed by Selmi et al. [30] include identification, segmentation, and recognition using two CNN models. The first CNN is used for classification between plates and non-plates. The second CNN is used for digit segmentation. The images are also taken as poor, good, bright with different viewpoints. Preprocessing is used for character recognition. Several works have been proposed on Automatic License Plate Recognition www.aetic.theiaer.org but the proposed model is limited to their country and does not perform well in other countries. To overcome this problem Henry et al. in [31] proposed a method that consists of three main approaches are License Plate (LP) detection, Unified Character Recognition, and Multinational LP layout detection. For character recognition, YOLOv3 is used that consists of the Spatial Pyramid Pooling (SPP) block. A layer detection algorithm is proposed that can extract the correct sequence of License Plate. The license plate from South Korea, Taiwan, the United States of America, Greece, and Croatia was used to validate the proposed model. The seven numeric plate characters are identified and recognized in this system from the available Brazilian License Plate dataset. Silva et al. in [32] proposed two YOLO-based neural networks, such as (FV / LPD-NET) and (LPS / CR-NET) respectively, which are developed to detect vehicle frontal view and license plate and also to recognize characters. When test 63.18% were correctly detected. It gave a precision of 97.39% for detecting at least five characters. By this form, 93% and 99% of characters are correctly identified and segmented. Car manufacturer/model identification can be performed in future works in the ALPR pipeline. Raju et al. [33] have proposed a method of developing a handheld device that can scan the vehicle's number and inform whether the vehicle belongs to the organization or not by checking in the database. To identify the license plate number Optical Character Recognition (OCR) is implemented in Raspberry pi. To test the handheld device approximately 70 vehicles have been tested including two-wheelers and fourwheelers. The proposed model gave a promising result when the quality of the input image was high and was not much accurate in the case of low-resolution images.
Image processing plays a significant role in numerous real-time applications. In these approaches, the still camera senses images of the number plate. A self-organized CNN function is used in [34] to identify the status of vehicles from number plate pictures. It is achieved with a low level of preparation and 90% accuracy. The actual obstacle that lies here is the real-time implementation of CNN. Dong et al. [35] proposed a fast area CNN method called RCNN. RCNN is used for regressing corner images of the detected plate. Networks and shared-weight recognizers are in the identification stage of parallel spatial transformation. The recognizer eliminates 57.5% of the errors and works well. Rectification is done for seeing distinct viewpoints and STN is done for character segmentation. This approach outperformed other approaches. For high-security surveillance purposes, a system called Vehicle Number Plate Recognition System (VPRS) was tested in tollgates by Rajathilagam et al. [36]. For detecting and recognizing vehicle number plate details, a neural network-based genetic algorithm is used in this approach. The optical characters of vehicles are identified through this algorithm. Table 1 compares the state-of-the-art literature in the field of license plate detection and recognition. The table compares various techniques such as SVM, CNN, YOLO, and so on. It is perceived that the stateof-the-art literature used different combination models for license plate detection and recognition. It is also noticed that few approaches have used image segmentation techniques to extract the license plate characters. Hence, in our approach, we used a three-step process to detect, segment, and recognize the characters. YOLOv3-SPP -Multinational LP layout detection [25] YOLOv2 -Convolutional Neural Networks (CNN) [32] FV/LPD-NET YOLO LPS/CR-NET [26] Faster RCNN Tesseract LSTM Neural Network [38] Sobel Edge Detector Morphological operation The normalized cross-correlation method [39] Template Matching -Template matching and bounding box methods www.aetic.theiaer.org

Proposed System and Architecture
The proposed model is divided into three stages they are License plate detection, Plate Segmentation, and plate recognition. The flow of the three stages is shown in Fig 1.  The image or video is captured and passed to the detection stage to identify the license plate. Then the license plate is passed through the segmentation process where all the characters are segmented. The segmented images are stored in a list to be passed to the CNN model in the recognition stage to identify the letters. The architecture of the proposed ALPDR model is shown in Fig 2. The architecture consists of a camera to capture the image or video of the vehicle to be given as the input to the model. At first, we have the license plate detection module. It identifies the license plate from the input image by comparing it with the bounding box dimensions. Then the detected license plate is given as the input to the segmentation module which segments the characters from the license plate and saves the order of the characters. Finally, the CNN module is used for character recognition that recognizes the characters from the license plate and gives the output. The detailed operation of the proposed ALPDR model is presented as algorithm 1.

Image capturing
The input to this model can be in the form of videos or images. If video is provided as input then the whole video is separated into several frames and then one frame is passed to the model for further steps. The image captured is then passed to the license plate detection module to identify the license plate of the vehicle.

License Plate Detection
Image pre-processing is a crucial step while dealing with images. The captured images are preprocessed according to the dimensions required for the proposed ALPDR model. Then the license plate detection module identifies the license plate from the input images by performing RGB to grayscale conversion and binarization of the image. convert the black pixel to white pixel and vice versa 10: check the width and add it to the list 11: save characters order 12: Recognize the segmented characters using the trained CNN model 13: Prints the recognized characters as output _____________________________________________________________________

RGB to Grayscale Conversion
In a computer vision system, edge detection is considered as an information reduction process that offers region boundary information by filtering out unnecessary information for the next steps of processes. Grayscale image use just one channel of color where only 8 bits is necessary to represent this image whereas in case of coloured images three channels are used where 24 bits are necessary to represent the image. Not in all cases of edge detection grayscale images can be used. In our case, we need to detect the edges of the license plate and the rest of the information is considered as noise and needs to be ignored. The extra information available in colored images can be considered as noise and be removed from the images when using grayscale images. Hence, we use the grayscale image over the colored image in our usecase.

Binarization of the Image
To convert the image containing pixels to a binary image is called binarization, i.e. containing pixels of black and white. The aim of binarization is to detect the edges available in the image. These edges detected in the binary image will be useful to detect the license plate. To create a binary image from a grayscale image Otsu's thresholding algorithm [40] available in scikit-image is used which is the simplest way to segment objects from the background.
Otsu's approach [40] works by iterating over all conceivable threshold values and calculating a measure of spread for the pixel levels on each side of the threshold, i.e. the pixels that either fall in foreground or background. The goal is to obtain the least threshold value for the total of foreground and background spreads. Once the edges in the license plate are detected a red color bounding box is formed over the license plate.
Before the bounding box is formed we check for a condition whether the bounding box predicted is very small (by comparing to the sample license plate dimension that is passed to the model). If the predicted bounding box is very small then we discard it and assume that it could not be a license plate. Once the license plate is detected by satisfying the condition a red color rectangular box is drawn over it as shown in

Characters Segmentation
In the detection step, the license plate is detected through a bounding box over the license plate. In this segmentation stage, the bounding boxes are used over all the characters available in the license plate. This is the important step, only if the characters are segmented properly the model will be able to recognize the characters.
Character segmentation is performed using the existing library functions available in matplotlib. At first, we invert the detected license plate from black pixel to white pixel and vice versa. In this step, we again specify the width of the license plate and ensure that it does not exceeds the specified width. We use region props available in Skimage and patches from matplotlib to make the bounding boxes over the characters as shown in Fig 4. The rectangle class from matplotlib patches is used to segment the specific characters based on the width, height, and rotation angle. We resize the boxed characters and append them to a list so that we don't change the order in which we are predicting the letters.

Character Recognition
The final step is to recognize the segmented characters from the license plate. We train a Convolutional Neural Network (CNN) with two convolutional layers for recognizing the segmented characters. Since the focus is to develop a lightweight model, we proposed the CNN model with a minimum number of layers and parameters. The CNN model can efficiently learn the low level features from the input and can classify it. Hence, it is used in the recognition module of this work.

Convolutional Neural Network Model
To map image data to an output variable, CNNs were designed. They have been so popular that they have been given priority for any sort of prediction issue involving image information as input. The justification for using the Convolutional Neural network is that the important features of an image can be identified automatically without any human supervising the model. CNN's are very effective in the case of classification problems. Hence, the CNN model is used to correctly recognize the segmented images from the license plate. The proposed CNN architecture consists of combinations of 2D convolution and pooling layers at the end we have flatten and dense layers. The proposed CNN model takes input from the segmentation process output. The model accepts the input of size 28x28. Since we feed in individual segmented characters as input 28x28 image dimension is sufficient to recognize the character. The model contains two 2Dconvolution layers. We used max pooling layers of size 2x2 between the convolution layers to extract the low level features from the images. At the end of the model, we used a flatten layer to convert the input dimensions into a single dimension and dense layers at the last for multi-nominal classification. To measure the loss, categorical cross-entropy whereas the optimizer used was Adam. The architecture of the proposed CNN model is shown in figure 4. The original input and the recognized output by the CNN model are shown in Table 2 with different datasets. www.aetic.theiaer.org

Dataset Details
To train and test the model we used 3 different types of datasets extracted from the Chars74k [41] dataset which is commonly used for license plate character recognition. The variations in all the three datasets are as follows, Dataset A contains 834 images with 36 classes (A-Z and 0-9) that were extracted from the Chars74K dataset. These 834 images are clear without any noise being added to them. The width of the characters is almost the same. But in a real-life scenario, the images may be thicker and sometimes very thin. The sample dataset is shown in fig 5 (a). In dataset B we added some more images manually and it contained 998 images with the same 36 classes. Totally 164 more images with the existing images were added. These newly added images were manually cropped from using the snipping tool from the Chars74K dataset and even reduced the pixels for some images in all 36 classes. Some sample images are shown in fig  5 (b). This is the third type of dataset that we used to train the model. Dataset C had a total of 1229 images. A total of 231 images were added. All the 231 images added in this C dataset were collected manually from the internet. Some images included dark and light backgrounds. The thickness of the characters was different and some images were cropped manually. Some samples of images are shown in fig 5 (c).

Experimental Setup
The experiment was carried out in Google Colab, a cloud server. In this cloud server, we used Tesla P100 as a GPU with 16GB RAM for enhancing the running time. The training tools used in this case were Python, Keras, Tensorflow etc.

Experimental Analysis
Several experiments with different batch sizes, and with different datasets as shown in Table 2 were carried out to find a better model that would be more generalized in all cases. Table 2 shows the variations of the predicted output with its accuracy when using a different dataset for the same number of iterations. To validate the accuracy of the predicted output we have used the kappa score. Kappa score is one of the evaluation matrices for measuring accuracy. We trained the model by changing several parameters to find out the best point where our model would perform well. We have mentioned some of the hyperparameter changes and their effect on model performance is shown in Table 2. When dataset A was used to train the model we got and maximum training accuracy of 98.15, still the model didn't perform well in predicting the letter and got a very low accuracy by predicting 3 characters incorrectly. The main reason for the incorrect classification is the variation in the dataset. Dataset A didn't have much variation and hence the model was unable to classify characters with noise or the one which was blurred. In the case of the C dataset, there were so many variations of the image like blurring and cropped images so the model could able to classify them with higher accuracy. The training accuracy and validation accuracy of the model using dataset C is plotted in Fig. 6 and the training and validation loss is shown in Fig. 7.

Model Performance in Each Stage
As discussed above the model includes three stages and the testing of all the three stages have been performed with images captured from real-time which had a total of 20 images and several videos. The performance of the model has been analyzed with eth real-time data in all stages to find out where the model lags. The result in all three stages are as follows:

Detection of License Plate
The model has been tested on several real-time vehicles by passing an image and video taken in realtime to check how well the model can detect the license plate and in which case it is failing to detect the license plate. Real-time images as shown in Fig. 8 is passed to the model in this stage to detect the license plate. When tried with more than 20 real-time images in different conditions, the model achieved 91.1% accuracy in detecting the license plate. The main reason for the reduction in accuracy was the additional edges available in most of the images.

Segmentation of Characters
From the License plate detected the character available in the license plate are segmented (or boxed). With all the 20 real-time images available, we test the segmentation accuracy and got a satisfactory accuracy of 96.67% which is efficient. The problem in the segmentation of the characters occurred due to the extra noise present in the license plate.

Recognition of Segmented Characters
All the characters segmented from the license plate were tested by the trained CNN model to identify the prediction accuracy. All the characters segmented from the license plate under different situations such as day and night images (from step 2 i.e. Segmentation of Characters) were tested to verify how well CNN can recognize these characters and it gave an accuracy of 98.8% accuracy. The model is unable to recognize the characters when blurred heavily. The performance of the proposed ALPDR model in each stage is visualized in Fig. 9.

Performance Evaluation
This section evaluates the performance of the proposed ALPDR model with other state-of-the-art methods. Table 3 compares the results of the proposed model and other models. It also describes the methodology through which the proposed model can get better accuracy. From the evaluation, it is evident that the proposed ALPDR model is efficient than other state-of-the-art approaches.

Discussion
In this section, we discuss the highlights and limitations of the proposed ALPDR model. Based on the experimental analysis and the performance evaluation it is evident that the proposed ALPDR model is efficient in detecting, segmenting, and recognizing license plates of cars in different conditions such as day, night, varying distance, etc. The ALPDR model is trained with the public dataset Chars74k which contains www.aetic.theiaer.org images of different characters. We have generated three different types of the dataset from the Chars74k dataset to analyze the efficiency of the model. We observed that the dataset which consists of varying images such as blurred, and cropped images yield better accuracy in terms of detection and recognition. Then the proposed ALPDR model is tested in the real-time scenario, the model has recognized the characters in the license plate with 98.8 accuracy which is superior to many state-of-the-art approaches. The details of the performance evaluation are shown in Table 3.
Since the focus of our proposed model is to build a lightweight ALDPR model that recognize the licence plate efficiently, there are few limitations included in our study. The proposed model uses an edge detection technique to detect the license plate rather than using object detection. Due to the limitation of the dataset we used for this study, we adopted an edge detection technique to detect the license plate which proves to be efficient for our model. Another challenge we faced with Indian vehicles is that even though there is a clear policy on the size and colors of the vehicle license plate, still there are no common standards followed for the font and size of the license plate. Hence, our model sometimes produced incorrect character recognitions when the characters and numbers are written in different fonts without complying with the policies. Further, the results we achieved are for standing vehicles, the model is not tested with moving vehicles. In the future, we would like to extend our work for moving vehicles by capturing the video frames. Also, we would like to make the deep learning model deeper with appropriate regularization to improve the detection and recognition accuracy.

Conclusion
Aiming at the practicability of License Plate Detection and Recognition in the parking area, tolls, and other places where a huge number of vehicles enter and exit, an efficient Automatic License Plate Detection and Recognition (ALPDR) model is designed. In this research paper, a three-step process model to detect the license plate and to recognize the characters from the license plate is proposed. The proposed model uses convolutional features to recognize the character and achieved an accuracy of 98.8% in recognizing the characters. In the detection of the license plate, the model achieved an accuracy of 91.1%. The proposed ALPDR model has experimented with real-time car images with different license plate formats and the performance evaluation proves that the proposed model is efficient compared to state-of-the-art methods. In the future work, the proposed model will be integrated with IoT-based systems to efficiently detect the license plate of the moving vehicles also would like to add some super-resolution techniques to improve the license plate detection in different weather conditions.