GTC: Guided Training of CTC towards Efficient and Accurate Scene Text Recognition
Authors: Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin
AAAI 2020, pp. 11005-11012
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and needs 6 times shorter inference time than attention-based methods. |
| Researcher Affiliation | Collaboration | 1Nanyang Technological University 2Sense Time Group Ltd. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | There are three public synthetic datasets, namely Synth90K (Jaderberg et al. 2015), SynthText (Gupta, Vedaldi, and Zisserman 2016) and SynthAdd (Li et al. 2019)...IIIT5K-Words (IIIT5K) (Mishra, Alahari, and Jawahar 2012)...Street View Text (SVT) (Wang, Babenko, and Belongie 2011)...ICDAR 2003 (IC03) (Lucas et al. 2003)...ICDAR 2013 (IC13) (Karatzas et al. 2013)...ICDAR 2015 (IC15) (Karatzas et al. 2015)...SVT Perspective (SVT-P) (Quy Phan et al. 2013)...CUTE80 (Risnumawan et al. 2014)...COCO-Text (COCO) (Veit et al. 2016). |
| Dataset Splits | No | The paper mentions training and testing data but does not explicitly provide details about a validation dataset split. |
| Hardware Specification | Yes | We implement our proposed network structure with PyTorch and conduct all experiments on NVIDIA Tesla V100 GPUs with 16GB memory. |
| Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | We use a batch size of 32 on each GPU, with 32 GPUs in total. ADAM optimizer is chosen for training, with the initial learning rate set to 10^-3 and a decay rate of 0.1 every 30000 iterations. (See the configuration sketch below the table.) |
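
The hyperparameters in the Experiment Setup row map directly onto a few lines of PyTorch. Below is a minimal sketch, assuming a placeholder model and a dummy loss, since the GTC network and its combined CTC/attention objective are not released; only the optimizer choice (ADAM), the initial learning rate of 10^-3, and the step decay of 0.1 every 30,000 iterations come from the paper.

```python
import torch
import torch.nn as nn

# Placeholder recognizer: the paper's GTC network (a CTC decoder guided by an
# attention branch) is not public, so a dummy module stands in here.
model = nn.Linear(512, 37)  # hypothetical feature size and charset size

# From the paper: ADAM with initial learning rate 1e-3,
# decayed by a factor of 0.1 every 30,000 iterations.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30000, gamma=0.1)

batch_size_per_gpu = 32  # with 32 GPUs, an effective batch size of 1024

for iteration in range(100):  # illustrative; the total iteration count is not reported
    optimizer.zero_grad()
    # In GTC the loss combines a CTC term with an attention guidance term;
    # a dummy scalar stands in for it here.
    loss = model(torch.randn(batch_size_per_gpu, 512)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # stepping once per iteration realizes the 30k-iteration decay
```

Note that `StepLR` counts calls to `scheduler.step()`, so calling it once per training iteration (rather than once per epoch) is what matches the paper's iteration-based decay schedule.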