PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

Authors: Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

AAAI 2021, pp. 2782-2790

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments prove that the proposed method achieves competitive accuracy while significantly improving the running speed.
Researcher Affiliation | Collaboration | School of Artificial Intelligence, Xidian University; Department of Computer Vision Technology, Baidu Inc.
Pseudocode | No | The paper describes its procedures and models using natural language and diagrams, but it does not include formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about, or a link to, open-source code for the described methodology.
Open Datasets | Yes | The benchmark datasets used for the experiments are briefly introduced below. ICDAR 2015 (Karatzas et al. 2015) was collected for the ICDAR 2015 Robust Reading Competition, with 1,000 natural images for training and 500 for testing; its text instances are annotated at the word level. Total-Text (Ch'ng and Chan 2017) is a curved-text benchmark consisting of 1,255 training images and 300 testing images with multiple orientations: horizontal, multi-oriented, and curved.
Dataset Splits | No | The paper specifies training and testing sets (e.g., 1,000 natural images for training and 500 for testing in ICDAR 2015) and mentions fine-tuning on combined datasets, but it does not explicitly define a separate validation split or its purpose.
Hardware Specification | Yes | The experiments are performed on a workstation with the following configuration. CPU: Intel(R) Xeon(R) E5-2620; GPU: NVIDIA TITAN Xp ×4; RAM: 64 GB.
Software Dependencies | No | The paper mentions models and optimizers (e.g., ResNet-50, the Adam optimizer) but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | The stem network is initialized with weights pre-trained on ImageNet (Deng et al. 2009). The training process is divided into a warming-up step, a fine-tuning step, and a training step for the GRM module. In the warming-up step, the Adam optimizer is applied with learning rate 1e-3 and a learning-rate decay factor of 0.94 on SynthText (Gupta, Vedaldi, and Zisserman 2016); in the fine-tuning step, the learning rate is re-initialized to 1e-3... The loss weights λ1, λ2, λ3, and λ4 are set to {1.0, 1.0, 1.0, 5.0} empirically, the max length is set to 64, and the batch size is set to 1 on a single GPU.
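
For reference, below is a minimal PyTorch sketch of how the reported hyperparameters could be wired together. Since the paper releases no code, the placeholder model, the per-epoch decay stepping, and the pairing of each λ with a specific PGNet branch loss (TCL, TBO, TDO, TCC) are assumptions on our part, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Placeholder standing in for PGNet's backbone and heads (no official code is released).
model = nn.Conv2d(3, 64, kernel_size=3, padding=1)

# Adam with learning rate 1e-3, as reported for the warming-up step.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Exponential decay with factor 0.94; the paper does not state the stepping
# interval, so calling scheduler.step() once per epoch is an assumption.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.94)

# Loss weights lambda_1..lambda_4 = {1.0, 1.0, 1.0, 5.0}, as reported.
LAMBDAS = (1.0, 1.0, 1.0, 5.0)

def total_loss(l_tcl, l_tbo, l_tdo, l_tcc):
    # Weighted sum of the four branch losses; mapping each lambda to the
    # TCL/TBO/TDO/TCC head is our reading of the paper, not a stated fact.
    return sum(w * l for w, l in zip(LAMBDAS, (l_tcl, l_tbo, l_tdo, l_tcc)))
```

With batch size 1 on a single GPU, as the paper reports, each training iteration would compute the four branch losses, combine them with total_loss, and back-propagate before the scheduler advances the learning rate.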