Feature Enhancement Network: A Refined Scene Text Detector

Authors: Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo

Venue: AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on ICDAR 2011 and 2013 robust text detection benchmarks demonstrate that our method can achieve state-of-the-art results, outperforming all reported methods in terms of F-measure.
Researcher Affiliation | Academia | Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo; School of Electronic and Information Engineering, South China University of Technology; zsscut90@gmail.com, liu.yuliang@mail.scut.edu.cn, {lianwen.jin, canjie.luo}@gmail.com
Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 1) and describes algorithmic steps in prose, but it does not contain formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | To prove the effectiveness of our approach, we have tested it on two challenging benchmark datasets, i.e. ICDAR 2011 (Shahab, Shafait, and Dengel 2011) and ICDAR 2013 (Karatzas et al. 2013) robust text detection datasets. ... we also gather about 4000 real scene images for training our network.
Dataset Splits | No | The paper mentions training and testing on the ICDAR 2011 and 2013 datasets, but does not explicitly describe a validation set or specific train/validation/test splits with percentages or sample counts.
Hardware Specification | Yes | All the experiments are carried out on a PC with one Titan X GPU.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries with their versions).
Experiment Setup | Yes | During the training procedure, we choose similar multitask loss functions for both the text region proposal and text detection refinement stages, i.e. $L(s, c, b, g) = \frac{1}{N}\left(L_{cls}(s, c) + \lambda L_{loc}(b, g)\right)$, where $N$ is the number of anchors or proposals that match ground-truth boxes, and $\lambda$ ($\lambda = 1$) is a balance factor that weighs the importance of the two losses. ... Basically, the batch size of input images in the R-FCN (Dai et al. 2016) framework is only one. ... Our approach and the original R-FCN are both trained and tested with short side 720, except for the multi-scale test.
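
For readers attempting to reproduce the quoted training objective, below is a minimal sketch of that multitask loss. Since the paper releases no code, the function name, tensor shapes, and the choice of smooth L1 for $L_{loc}$ (the standard choice in the R-FCN/Fast R-CNN family) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def multitask_loss(cls_scores, cls_targets, box_preds, box_targets,
                   matched_mask, lam=1.0):
    """Sketch of L(s, c, b, g) = (1/N) * (L_cls(s, c) + lam * L_loc(b, g)).

    Assumed (hypothetical) shapes:
      cls_scores:   (A, num_classes) predicted class logits s
      cls_targets:  (A,) ground-truth labels c (0 = background)
      box_preds:    (A, 4) predicted box offsets b
      box_targets:  (A, 4) box regression targets g
      matched_mask: (A,) bool, True for anchors/proposals matched to a
                    ground-truth box; N is the number of such matches.
    """
    # N: number of matched anchors/proposals; clamp avoids division by zero.
    n = matched_mask.sum().clamp(min=1).float()

    # Classification loss L_cls over all sampled anchors/proposals.
    l_cls = F.cross_entropy(cls_scores, cls_targets, reduction="sum")

    # Localization loss L_loc only on matched (positive) anchors/proposals.
    l_loc = F.smooth_l1_loss(box_preds[matched_mask],
                             box_targets[matched_mask], reduction="sum")

    # Balance factor lambda = 1 per the quoted setup.
    return (l_cls + lam * l_loc) / n
```

The same form of loss would apply at both stages described in the quote (text region proposal and text detection refinement), with $N$ counting matched anchors in the first stage and matched proposals in the second.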