Gated Recurrent Convolution Neural Network for OCR

Authors: Jianfeng Wang, Xiaolin Hu

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that the proposed model outperforms existing methods on several benchmark datasets, including IIIT-5K, Street View Text (SVT) and ICDAR. ... The proposed method outperforms most existing models for both constrained and unconstrained text recognition.
Researcher Affiliation | Academia | Jianfeng Wang, Beijing University of Posts and Telecommunications, Beijing 100876, China (jianfengwang1991@gmail.com); Xiaolin Hu, Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Computer Science and Technology, Center for Brain-Inspired Computing Research (CBICR), Tsinghua University, Beijing 100084, China (xlhu@tsinghua.edu.cn)
Pseudocode | No | No pseudocode or algorithm blocks found.
Open Source Code | Yes | The code and pre-trained model will be released at https://github.com/Jianfeng1991/GRCNN-for-OCR.
Open Datasets | Yes | ICDAR2003 [24] contains 251 scene images with 860 cropped word images. ... IIIT5K has 3000 cropped testing word images and 2000 cropped training images collected from the Internet [31]. ... Street View Text (SVT) has 647 cropped word images from Google Street View [36]. ... Synth90k contains around 7 million training images, 800k validation images and 900k test images [15].
Dataset Splits | Yes | The validation set of Synth90k is used for model selection.
Hardware Specification | No | No specific hardware details (GPU/CPU models, processors, memory) are mentioned for the experiments.
Software Dependencies | No | The ADADELTA method [41] is used for training with the parameter ρ = 0.9; no software packages or versions are specified.
Experiment Setup | Yes | The input is a gray-scale image resized to 100 × 32. Before being fed to the network, the pixel values are rescaled to the range (-1, 1). The final output of the feature extractor is a feature sequence of 26 frames. The recurrent layer is a bidirectional LSTM with 512 units and no dropout. The ADADELTA method [41] is used for training with ρ = 0.9. The batch size is set to 192 and training is stopped after 300k iterations.
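The Experiment Setup row describes a concrete input pipeline (resize to 100 × 32, rescale pixels to (-1, 1)). A minimal sketch of that preprocessing is below; the nearest-neighbour resize and the assumption that raw pixels lie in [0, 255] are mine, since the excerpt states only the target size and the output range.

```python
def preprocess(gray_img):
    """Resize a grayscale image (a list of rows of ints in [0, 255]) to
    100 x 32 (width x height) and rescale pixel values to [-1, 1].

    Nearest-neighbour sampling is an assumption; the paper excerpt does
    not state which interpolation method is used.
    """
    target_w, target_h = 100, 32
    h, w = len(gray_img), len(gray_img[0])
    out = []
    for i in range(target_h):
        src_row = gray_img[i * h // target_h]  # nearest source row
        # Map each sampled pixel from [0, 255] to [-1, 1].
        out.append([src_row[j * w // target_w] / 127.5 - 1.0
                    for j in range(target_w)])
    return out

# Example: a 48 x 160 dummy image becomes a 32-row x 100-column input.
img = [[(r * c) % 256 for c in range(160)] for r in range(48)]
x = preprocess(img)
print(len(x), len(x[0]))  # 32 100
```

The resulting 100 × 32 input would then feed the convolutional feature extractor, whose 26-frame output sequence is consumed by the bidirectional LSTM described in the same row.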