Efficient License Plate Recognition via Holistic Position Attention
Authors: Yesheng Zhang, Zilei Wang, Jiafan Zhuang
AAAI 2021, pp. 3438-3446
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on four public datasets, including AOLP, Media Lab, CCPD, and CLPD, well demonstrate the superiority of our method over previous state-of-the-art methods in both accuracy and speed. We experimentally evaluate the proposed method on the AOLP, Media Lab, CCPD, and CLPD datasets. |
| Researcher Affiliation | Academia | Yesheng Zhang, Zilei Wang, Jiafan Zhuang, University of Science and Technology of China. {ysyzhang, jfzhuang}@mail.ustc.edu.cn, zlwang@ustc.edu.cn |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor are there any structured code-like blocks describing a procedure. |
| Open Source Code | No | The paper does not provide any explicit statement about making the source code for their proposed method publicly available, nor does it include any links to a code repository. |
| Open Datasets | Yes | Here four challenging public license plate datasets are used, i.e., AOLP (Hsu, Chen, and Chung 2012), Media Lab (Anagnostopoulos et al. 2008), CCPD (Xu et al. 2018), and CLPD (Zhang et al. 2020). |
| Dataset Splits | Yes | For AOLP, we follow the same training/test split as in (Li and Shen 2016; Zhuang et al. 2018), i.e., two subsets are used for training and the remaining one for testing. For Media Lab, we follow (Zhuang et al. 2018) and perform 4-fold cross-validation: the normal subset is evenly divided into four parts at random, and four experiments are conducted, each using three parts together with the difficult subset for training and the remaining part for testing. For CCPD, the official training/test split is used (i.e., 100K images for training and 100K for testing). (A split sketch follows the table.) |
| Hardware Specification | Yes | During training, we use a mini-batch of size 256 on 2 GTX1080Ti GPUs and the RMSProp optimizer (Ruder 2016). All evaluation experiments are performed on a server with one NVIDIA GTX 1080Ti GPU, an Intel(R) Xeon(R) CPU E5-2640 v4, and about 500 GB of memory. |
| Software Dependencies | No | The paper mentions software components like the RMSProp optimizer, BiSeNet, YOLOv4, and ResNet architectures, but it does not provide specific version numbers for these or for any underlying software libraries or programming languages (e.g., PyTorch, Python, CUDA). |
| Experiment Setup | Yes | During training, we use a mini-batch of size 256 on 2 GTX1080Ti GPUs and the RMSProp optimizer (Ruder 2016). The initial learning rate is set to 2e-5 for each dataset. Throughout the experiments, the license plate images are resized to 50×160 following (Zhuang et al. 2018), so the aspect ratio is close to that of real license plates. We adopt some common data augmentation strategies, such as noising, blurring, color jittering, rotation, projection, and cropping. We use ResNet-18, -34, -50, and -101 (He et al. 2016) as the base network in the backbone, and models pretrained on ImageNet (Deng et al. 2009) are used for initialization. (A configuration sketch follows the table.) |
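
The Media Lab cross-validation protocol quoted in the Dataset Splits row can be made concrete with a short sketch. This is our own illustration, not code from the paper: the function name `medialab_cv_splits`, the seed handling, and the policy for remainder images are all assumptions.

```python
import random

def medialab_cv_splits(normal_images, difficult_images, seed=0):
    """4-fold cross-validation as described for Media Lab: the normal
    subset is shuffled and evenly divided into four parts; each fold
    trains on three parts plus the entire difficult subset and tests
    on the remaining part."""
    rng = random.Random(seed)  # seed is an assumption; the paper only says "randomly"
    normal = list(normal_images)
    rng.shuffle(normal)
    fold = len(normal) // 4
    parts = [normal[i * fold:(i + 1) * fold] for i in range(4)]
    parts[-1].extend(normal[4 * fold:])  # remainder policy: our assumption
    splits = []
    for k in range(4):
        test = parts[k]
        train = [x for i, p in enumerate(parts) if i != k for x in p]
        train += list(difficult_images)  # difficult subset is always in training
        splits.append((train, test))
    return splits
```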
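Similarly, the Experiment Setup row can be reconstructed as a minimal training configuration. Since the paper names no framework or library versions (see Software Dependencies), PyTorch/torchvision and the specific augmentation magnitudes below are assumptions; only the optimizer, learning rate, batch size, input size, and ImageNet-pretrained ResNet backbones come from the quoted text, and the method's HPA-specific modules are omitted.

```python
import torch
import torchvision
from torchvision import transforms

# Backbone: ResNet-18 pretrained on ImageNet, as quoted; ResNet-34/50/101
# are used the same way. (The framework choice is an assumption.)
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")

# RMSProp with the quoted initial learning rate of 2e-5.
optimizer = torch.optim.RMSprop(backbone.parameters(), lr=2e-5)

# Inputs resized to 50x160 (H x W); augmentation magnitudes are guesses,
# since the paper only lists the strategy names.
train_transform = transforms.Compose([
    transforms.Resize((50, 160)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.RandomRotation(degrees=5),
    transforms.GaussianBlur(kernel_size=3),
    transforms.ToTensor(),
])
# A DataLoader with batch_size=256 spread over 2 GPUs (e.g. via
# torch.nn.DataParallel) would match the quoted hardware setup.
```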