Scene Text Detection in Video by Learning Locally and Globally

Authors: Shu Tian, Wei-Yi Pei, Ze-Yu Zuo, Xu-Cheng Yin

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Moreover, our proposed techniques are extensively evaluated on several public scene video text databases, and are much better than the state-of-the-art methods. Experimental results show that our approach significantly outperforms the state-of-the-art methods on all datasets.
Researcher Affiliation | Academia | Shu Tian, Wei-Yi Pei, Ze-Yu Zuo, and Xu-Cheng Yin, Department of Computer Science, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China. Corresponding author: xuchengyin@ustb.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | Moreover, our proposed system is verified on a variety of public scene text video databases, i.e., the Minetto [Minetto et al., 2011] and ICDAR 15 datasets [Karatzas et al., 2015a]. To evaluate our tracking based detection approach, a public video dataset with a variety of scene videos is first used in our experiments (footnote 2: http://www.liv.ic.unicamp.br/~minetto/datasets/text/VIDEOS/). Moreover, we also perform experiments of our method on the recent challenging dataset of the ICDAR 2015 Robust Reading Competition Challenge 3 (Text Detection and Recognition in Scene Videos) (footnote 3: http://rrc.cvc.uab.es/?ch=3&com=introduction).
Dataset Splits | No | The MSRA-TD500 dataset is a multi-orientation dataset with 500 images, where 300 images are for training and the rest are for testing. This dataset includes a training set of 25 videos (13450 frames in total) and a test set of 24 videos (14374 frames in total). While training and test splits are mentioned, a validation set is not explicitly specified or quantified, nor are full split details provided for all datasets. (A hedged sanity-check sketch for the quoted video split sizes follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using a "CNN-based word recognition technique" but does not specify any software libraries, frameworks, or solvers with version numbers.
Experiment Setup | No | The paper describes the overall system and some algorithmic details, but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific training configurations.
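
The split sizes quoted in the Dataset Splits row can serve as a quick sanity check when reproducing the experiments. The sketch below is not from the paper: the directory layout (data/icdar2015_text_in_videos/<split>/<video_id>/*.jpg), the helper names, and the frame extension are assumptions for illustration; only the video and frame counts come from the evidence quoted above.

from dataclasses import dataclass
from pathlib import Path


@dataclass
class VideoSplit:
    name: str        # split name, e.g. "train" or "test"
    num_videos: int  # number of videos reported for this split
    num_frames: int  # number of frames reported for this split


# Counts quoted in the "Dataset Splits" row (ICDAR 2015 Challenge 3, Text in Videos).
ICDAR15_VIDEO_SPLITS = [
    VideoSplit("train", num_videos=25, num_frames=13450),
    VideoSplit("test", num_videos=24, num_frames=14374),
]


def check_split(root: Path, split: VideoSplit, frame_ext: str = ".jpg") -> bool:
    """Compare on-disk video/frame counts against the reported split sizes.

    Assumes a hypothetical layout: <root>/<split.name>/<video_id>/<frame>.jpg
    """
    split_dir = root / split.name
    if not split_dir.is_dir():
        return False
    video_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
    frame_count = sum(len(list(d.glob(f"*{frame_ext}"))) for d in video_dirs)
    return len(video_dirs) == split.num_videos and frame_count == split.num_frames


if __name__ == "__main__":
    root = Path("data/icdar2015_text_in_videos")  # hypothetical local path
    for split in ICDAR15_VIDEO_SPLITS:
        ok = check_split(root, split)
        print(f"{split.name}: matches reported counts = {ok}")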