GTC: Guided Training of CTC towards Efficient and Accurate Scene Text Recognition

Authors: Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin

AAAI 2020, pp. 11005-11012 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and needs 6 times shorter inference time than attention-based methods.
Researcher Affiliation | Collaboration | (1) Nanyang Technological University, (2) SenseTime Group Ltd.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | There are three public synthetic datasets, namely Synth90K (Jaderberg et al. 2015), SynthText (Gupta, Vedaldi, and Zisserman 2016) and SynthAdd (Li et al. 2019)...IIIT5K-Words (IIIT5K) (Mishra, Alahari, and Jawahar 2012)...Street View Text (SVT) (Wang, Babenko, and Belongie 2011)...ICDAR 2003 (IC03) (Lucas et al. 2003)...ICDAR 2013 (IC13) (Karatzas et al. 2013)...ICDAR 2015 (IC15) (Karatzas et al. 2015)...SVT Perspective (SVT-P) (Quy Phan et al. 2013)...CUTE80 (Risnumawan et al. 2014)...COCO-Text (COCO) (Veit et al. 2016).
Dataset Splits | No | The paper mentions training and testing data but does not explicitly provide details about a validation dataset split.
Hardware Specification | Yes | We implement our proposed network structure with PyTorch and conduct all experiments on NVIDIA Tesla V100 GPUs with 16GB memory.
Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for software dependencies needed to replicate the experiment.
Experiment Setup | Yes | We use a batch size of 32 on each GPU, with 32 GPUs in total. ADAM optimizer is chosen for training, with the initial learning rate set to 10^-3 and a decay rate of 0.1 every 30000 iterations.
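
To make the reported training configuration easier to act on, the sketch below wires the stated hyperparameters (Adam, initial learning rate 10^-3, decay by a factor of 0.1 every 30,000 iterations, batch size 32 per GPU) into a minimal PyTorch training loop. The paper's network and code are not public, so the small LSTM encoder, CTC head, 37-symbol vocabulary, and random tensors here are placeholder assumptions used only to make the schedule runnable; they are not the authors' implementation, whose loss also includes an attention-based guidance branch.

```python
import torch
from torch import nn

# Placeholder recognizer: the GTC network is not released, so a tiny LSTM
# encoder with a CTC head stands in purely to exercise the reported schedule.
vocab_size = 37  # 36 alphanumerics + CTC blank (an assumption, not from the paper)
encoder = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
head = nn.Linear(128, vocab_size)
params = list(encoder.parameters()) + list(head.parameters())

# Reported setup: Adam, initial learning rate 1e-3,
# decayed by 0.1 every 30,000 iterations.
optimizer = torch.optim.Adam(params, lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30_000, gamma=0.1)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

batch_size_per_gpu = 32  # the paper runs 32 such batches in parallel (32 GPUs)

for step in range(100):  # stand-in loop over random data
    feats = torch.randn(batch_size_per_gpu, 25, 64)             # (N, T, F) feature sequences
    hidden, _ = encoder(feats)
    log_probs = head(hidden).log_softmax(-1).permute(1, 0, 2)   # CTC expects (T, N, C)
    targets = torch.randint(1, vocab_size, (batch_size_per_gpu, 10))
    input_lengths = torch.full((batch_size_per_gpu,), 25, dtype=torch.long)
    target_lengths = torch.full((batch_size_per_gpu,), 10, dtype=torch.long)
    loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # one step per iteration, so the decay fires at 30k, 60k, ... steps
```

In the paper's setup, 32 such per-GPU loops would run in parallel (for example via torch.nn.parallel.DistributedDataParallel), giving an effective batch size of 1024; the distribution strategy is an assumption, as the paper does not specify it.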