SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition
Authors: Chengwei Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Yi Niu, Fei Wu, Futai Zou
AAAI 2021, pp. 3305-3314
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed transformation outperforms existing rectification networks and achieves performance comparable to the state of the art. |
| Researcher Affiliation | Collaboration | (1) Shanghai Jiaotong University, China; (2) Hikvision Research Institute, China; (3) Zhejiang University, China |
| Pseudocode | No | The paper includes architectural diagrams and tables describing network configurations but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides no explicit statement or link indicating that its code has been open-sourced. |
| Open Datasets | Yes | Models are trained only on two public synthetic datasets, MJSynth (MJ) (Jaderberg et al. 2014) and SynthText (ST) (Gupta, Vedaldi, and Zisserman 2016), without any additional dataset or data augmentation. |
| Dataset Splits | No | The paper states that models are trained on MJSynth and SynthText and evaluated on specific test sets, but it does not explicitly describe training/validation/test splits or validation set sizes. |
| Hardware Specification | No | The paper describes the training process and model architecture but does not specify hardware details such as the GPU or CPU models used to run the experiments. |
| Software Dependencies | No | The paper mentions using the AdaDelta optimizer but does not specify any programming languages, libraries, or other software dependencies with version numbers needed to reproduce the experiments. |
| Experiment Setup | Yes | All images are resized to 32 × 100 before entering the network. K = 6 is the default setting. Parameters are randomly initialized using He et al.'s method (He et al. 2015) unless otherwise specified. Models are trained with the AdaDelta (Zeiler 2012) optimizer for 5 epochs with batch size 64. The learning rate is set to 1.0 initially and decayed to 0.1 and 0.01 at the 4th and 5th epochs, respectively. |
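
The hyperparameters in the Experiment Setup row map directly onto a standard PyTorch training loop. The sketch below is a minimal illustration under that assumption, not the authors' implementation (none was released): `SpinRecognizer`, `criterion`, and `train_loader` are hypothetical placeholders, while the 32 × 100 resize, He initialization, AdaDelta optimizer, batch size, and stepwise learning-rate decay follow the values reported above.

```python
import torch.nn as nn
from torch.optim import Adadelta
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import transforms

# All images are resized to 32 x 100 before entering the network
# (applied inside the dataset pipeline, which is elided here).
preprocess = transforms.Compose([
    transforms.Resize((32, 100)),
    transforms.ToTensor(),
])

# Hypothetical stand-ins: the paper releases no code, so the model, loss,
# and data loader are placeholders, not the authors' implementation.
model = SpinRecognizer(K=6)        # K = 6 is the paper's default setting
criterion = nn.CrossEntropyLoss()  # the paper's recognition loss is not restated here
train_loader = ...                 # DataLoader over MJSynth + SynthText, batch_size=64 (elided)

# Parameters are randomly initialized with He et al.'s method unless specified.
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# AdaDelta with initial learning rate 1.0, decayed by 10x at the start of the
# 4th and 5th epochs (milestones are 0-indexed); 5 epochs total.
optimizer = Adadelta(model.parameters(), lr=1.0)
scheduler = MultiStepLR(optimizer, milestones=[3, 4], gamma=0.1)

for epoch in range(5):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```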