MSR: Multi-Scale Shape Regression for Scene Text Detection

Authors: Chuhui Xue, Shijian Lu, Wei Zhang

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments over several public datasets show that the proposed MSR obtains superior detection performance for both curved and straight text lines of different lengths and orientations.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Nanyang Technological University; 2 School of Control Science and Engineering, Shandong University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | SynthText [Gupta et al., 2016] contains more than 800,000 synthetic scene text images most of which are at word level with multi-oriented rectangular annotations. CTW1500 [Yuliang et al., 2017] consists of 1,000 training images and 500 test images... Total-Text [Ch'ng and Chan, 2017] has 1,255 training images and 300 test images... MSRA-TD500 [Yao et al., 2012] consists of 300 training images and 200 test images. ICDAR2015 [Karatzas et al., 2015] has 1000 training images and 500 test images...
Dataset Splits | Yes | CTW1500 [Yuliang et al., 2017] consists of 1,000 training images and 500 test images... Total-Text [Ch'ng and Chan, 2017] has 1,255 training images and 300 test images... MSRA-TD500 [Yao et al., 2012] consists of 300 training images and 200 test images. ICDAR2015 [Karatzas et al., 2015] has 1000 training images and 500 test images...
Hardware Specification | Yes | The proposed technique is implemented using Tensorflow on a regular GPU workstation with 2 Nvidia Geforce GTX 1080 Ti.
Software Dependencies | No | The paper mentions "Tensorflow", "Adam optimizer", and "ResNet-50" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The network is optimized by Adam optimizer [Kingma and Ba, 2014] with a starting learning rate of 10^-4. ... The network is pre-trained on the SynthText, which is then fine-tuned by using the training images of each evaluated dataset with a batch size of 10. ... Parameters λ is the weight to balance the two losses which is empirically set at 1.0 in our implemented system.
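
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. The sketch below is illustrative only and is not the authors' code: the names `TrainConfig` and `total_loss` are hypothetical, and the form `loss_a + lam * loss_b` is an assumption, since the row states only that λ balances the two losses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    optimizer: str = "adam"
    learning_rate: float = 1e-4  # starting learning rate
    batch_size: int = 10         # batch size used for fine-tuning
    loss_weight: float = 1.0     # lambda, the weight balancing the two losses


def total_loss(loss_a: float, loss_b: float, lam: float) -> float:
    """Combine two loss terms as loss_a + lam * loss_b.

    Which term lambda multiplies is an assumption made for illustration;
    the quoted text only says lambda balances the two losses.
    """
    return loss_a + lam * loss_b


cfg = TrainConfig()
# With lambda = 1.0 the two terms contribute equally to the total.
combined = total_loss(0.5, 0.25, cfg.loss_weight)
print(combined)
```

With λ fixed at 1.0, the combined objective reduces to a plain sum of the two terms, so neither loss is favored during optimization.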