Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Learning to Read Irregular Text with Attention Mechanisms
Authors: Xiao Yang, Dafang He, Zihan Zhou, Daniel Kifer, C. Lee Giles
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model outperforms previous work on two irregular-text datasets: SVT-Perspective and CUTE80, and is also highly-competitive on several regular-text datasets containing primarily horizontal and frontal text. 5 Experiments We first conduct ablation experiments to carefully investigate the effectiveness of the model components. After that, we evaluate our model on a number of standard benchmark datasets for scene text recognition, and report word prediction accuracy. |
| Researcher Affiliation | Academia | Xiao Yang, Dafang He, Zihan Zhou, Daniel Kifer, C. Lee Giles The Pennsylvania State University, University Park, PA 16802, USA EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture and components but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions that a generated dataset 'will be made public', but there is no explicit statement about releasing the source code for the methodology. |
| Open Datasets | Yes | SVT-Perspective [Quy Phan et al., 2013], CUTE80 [Risnumawan et al., 2014], ICDAR03 [Lucas et al., 2003], SVT [Wang et al., 2011], III5K [Mishra et al., 2012], Following a similar method, we generate a large-scale synthetic dataset containing perspectively distorted and curved text... Such dataset will be made public to support future research for irregular text reading. |
| Dataset Splits | No | The paper mentions 'validation set' in Figure 5 but does not provide specific details on the dataset split percentages or sample counts for training, validation, or testing. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'AdaDelta [Zeiler, 2012]' but does not provide specific version numbers for software dependencies or other libraries used in the implementation. |
| Experiment Setup | Yes | The hyper parameters λ1 and λ2 in our training objective L are set to 10 at the beginning and decrease throughout training. To approximate WD, we project the 2D attention weights along 4 directions: 0° (horizontal), 90° (vertical), 45° and -45°. Beam Search with a window size of 3 is used for decoding in r. The proposed model is trained in an end-to-end manner using stochastic gradient descent. We adopt AdaDelta [Zeiler, 2012] to automatically adjust the learning rate. |
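
The four-direction projection quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes that "WD" denotes a Wasserstein distance between two 2D attention maps and that the distance is approximated by summing 1D Wasserstein-1 distances over the four projections. The function names, the mapping of the 45 and -45 labels to the two diagonals, and the unit grid spacing are all hypothetical choices made for illustration; this is not the authors' implementation.

```python
from itertools import accumulate


def project_attention(attn):
    """Collapse a 2D attention map (list of rows) into four 1D histograms.

    Assumption: "0" is the histogram over columns, "90" over rows, and the
    two diagonal labels are assigned arbitrarily for illustration.
    """
    h, w = len(attn), len(attn[0])
    horiz = [sum(attn[r][c] for r in range(h)) for c in range(w)]
    vert = [sum(row) for row in attn]
    diag = [0.0] * (h + w - 1)  # cells with equal (c - r)
    anti = [0.0] * (h + w - 1)  # cells with equal (r + c)
    for r in range(h):
        for c in range(w):
            diag[c - r + h - 1] += attn[r][c]
            anti[r + c] += attn[r][c]
    return {"0": horiz, "90": vert, "45": anti, "-45": diag}


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same unit grid.

    In one dimension this equals the summed absolute difference of the CDFs.
    """
    sp, sq = sum(p), sum(q)
    cdf_p = accumulate(x / sp for x in p)
    cdf_q = accumulate(x / sq for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def projected_distance(attn_a, attn_b):
    """Sum the 1D distances over the four projections (hypothetical aggregate)."""
    pa, pb = project_attention(attn_a), project_attention(attn_b)
    return sum(wasserstein_1d(pa[k], pb[k]) for k in pa)


if __name__ == "__main__":
    a = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    b = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(projected_distance(a, b))
```

Working on 1D projections keeps each distance a simple cumulative-sum computation, which is one plausible reason to prefer it over an exact 2D transport problem; the paper itself should be consulted for the actual formulation.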