Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement

Authors: Hang Guo, Tao Dai, Guanghao Meng, Shu-Tao Xia

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on TextZoom and four scene text recognition benchmarks demonstrate the superiority of our method over other state-of-the-art methods.
Researcher Affiliation | Collaboration | Hang Guo (1), Tao Dai (2), Guanghao Meng (1,3), Shu-Tao Xia (1,3); (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) College of Computer Science and Software Engineering, Shenzhen University; (3) Peng Cheng Laboratory, Shenzhen, China
Pseudocode | No | The paper describes its methods in detail using natural language and diagrams (Figures 2 and 3) but does not include any formal pseudocode blocks or algorithms.
Open Source Code | Yes | Code is available at https://github.com/csguoh/LEMMA.
Open Datasets | Yes | TextZoom [Wang et al., 2020] is widely used in STISR works. This dataset is derived from two single-image super-resolution datasets, RealSR [Cai et al., 2019] and SR-RAW [Zhang et al., 2019]. The images are captured by digital cameras in real-world scenes. In total, TextZoom contains 17367 LR-HR pairs for training and 4373 pairs for testing. (A data-loading sketch follows the table.)
Dataset Splits | No | The paper explicitly states the sizes of the training and testing sets ('17367 LR-HR pairs for training and 4373 pairs for testing') but does not specify a separate validation split or its size.
Hardware Specification | No | The paper specifies training details such as batch size, epochs, and learning rates, but gives no information about the hardware (e.g., GPU model, CPU, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using 'ABINet [Fang et al., 2021] as the attention-based text recognizer' and 'Adam [Kingma and Ba, 2014] for optimization', but does not specify version numbers for these components or for any other libraries or frameworks.
Experiment Setup | Yes | We train our model with batch size 64 for 500 epochs, using Adam [Kingma and Ba, 2014] for optimization. The learning rate is set to 1e-3 for the super-resolution network and 1e-4 for fine-tuning ABINet, both decayed by a factor of 0.5 after 400 epochs. We adopt the hyperparameters for Ltxt given in [Chen et al., 2021], namely λ1 = 10, λ2 = 0.0005. For the other hyperparameters, we use α1 = 0.5, α2 = 0.01; see the supplementary material for details. (A configuration sketch follows the table.)
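
TextZoom is typically distributed as LMDB archives. Below is a minimal sketch of reading its LR-HR pairs; the key layout (num-samples, image_lr-%09d, image_hr-%09d, label-%09d) is an assumption carried over from common STISR code releases, not something stated in the paper, so verify it against the LEMMA repository before relying on it.

```python
# Minimal sketch: reading TextZoom-style LR-HR pairs from an LMDB archive.
# ASSUMPTION: per-sample keys are 'image_lr-%09d', 'image_hr-%09d', and
# 'label-%09d' (1-indexed), plus a global 'num-samples' entry, following
# common STISR code releases; check against the LEMMA repository.
import io

import lmdb
from PIL import Image
from torch.utils.data import Dataset


class TextZoomPairs(Dataset):
    def __init__(self, lmdb_root: str):
        self.env = lmdb.open(lmdb_root, readonly=True, lock=False, readahead=False)
        with self.env.begin(write=False) as txn:
            self.num_samples = int(txn.get(b"num-samples"))  # 17367 train / 4373 test

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, idx: int):
        key_idx = idx + 1  # keys are 1-indexed in this layout
        with self.env.begin(write=False) as txn:
            lr = Image.open(io.BytesIO(txn.get(b"image_lr-%09d" % key_idx))).convert("RGB")
            hr = Image.open(io.BytesIO(txn.get(b"image_hr-%09d" % key_idx))).convert("RGB")
            label = txn.get(b"label-%09d" % key_idx).decode()
        return lr, hr, label
```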
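
To make the quoted setup concrete, here is a hedged PyTorch sketch of the optimizers and learning-rate schedule. The placeholder modules stand in for the super-resolution network and ABINet, and the pairing of α1/α2 with specific loss terms is an assumption (the paper defers those details to its supplementary material); only the numeric values come from the passage above.

```python
# Hedged sketch of the reported optimization setup (PyTorch assumed).
import torch
from torch import nn

sr_model = nn.Conv2d(3, 3, 3, padding=1)  # placeholder for the SR network
recognizer = nn.Linear(64, 37)            # placeholder for fine-tuned ABINet

opt_sr = torch.optim.Adam(sr_model.parameters(), lr=1e-3)
opt_rec = torch.optim.Adam(recognizer.parameters(), lr=1e-4)

# Both learning rates decay once, by a factor of 0.5, after epoch 400 of 500.
sched_sr = torch.optim.lr_scheduler.MultiStepLR(opt_sr, milestones=[400], gamma=0.5)
sched_rec = torch.optim.lr_scheduler.MultiStepLR(opt_rec, milestones=[400], gamma=0.5)

BATCH_SIZE, EPOCHS = 64, 500
lambda1, lambda2 = 10.0, 5e-4  # weights on L_txt, following [Chen et al., 2021]
alpha1, alpha2 = 0.5, 0.01     # ASSUMPTION: weights on the remaining loss terms

for epoch in range(EPOCHS):
    # ... one training pass over the 17367 TextZoom pairs would go here ...
    sched_sr.step()
    sched_rec.step()
```

A single MultiStepLR milestone at epoch 400 reproduces the stated one-time decay for both optimizers.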