Towards Accurate Video Text Spotting with Text-wise Semantic Reasoning

Authors: Xinyan Zu, Haiyang Yu, Bin Li, Xiangyang Xue

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The experimental results on multiple VTS benchmarks demonstrate that the proposed VLSpotter outperforms the existing state-of-the-art methods in end-to-end video text spotting." "Extensive experiments of text spotting, tracking and detection are conducted on three VTS benchmarks, ICDAR2013 Video [Karatzas et al., 2013], ICDAR2015 Video [Karatzas et al., 2015], and BOVText [Wu et al., 2021], to evaluate the effectiveness of the proposed VLSpotter."
Researcher Affiliation | Academia | "Xinyan Zu, Haiyang Yu, Bin Li, Xiangyang Xue. Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University. {xyzu20, hyyu20, libin, xyxue}@fudan.edu.cn"
Pseudocode | No | The paper describes its methods in prose and with a system diagram (Figure 2) but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code of VLSpotter is available at GitHub (https://github.com/FudanVI/FudanOCR/VLSpotter)."
Open Datasets | Yes | "We conduct experiments on three commonly-used datasets: ICDAR2013 Video [Karatzas et al., 2013], ICDAR2015 Video [Karatzas et al., 2015], and BOVText [Wu et al., 2021]."
Dataset Splits | Yes | "ICDAR2013 Video ... 13 videos are used for training and 15 videos for testing. ... ICDAR2015 Video ... 25 videos for training and 24 videos for testing. ... BOVText ... 1,328,575 frames from 1,541 videos are used for training and 429,023 frames from 480 videos are used for testing." (See the split summary after the table.)
Hardware Specification | Yes | "All experiments are conducted on a single RTX 3090 GPU with 24GB memory."
Software Dependencies | No | The paper states "The proposed VLSpotter is implemented with PyTorch." but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | "Based on the empirical experiments, we set the hyper-parameters ε, σ, ϕ, λ1, λ2 to 1, 0.5, 3, 1, 1, respectively. We use the Adadelta optimizer with an initial learning rate 0.1, which further shrinks every 200 epochs." (See the training-setup sketch after the table.)
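
For quick reference, the quoted train/test splits can be collected in a small Python dict. Only the counts come from the paper; the dict structure and the name VTS_SPLITS are ours for illustration.

    # Hypothetical summary of the splits quoted in the "Dataset Splits" row.
    VTS_SPLITS = {
        "ICDAR2013 Video": {"train_videos": 13, "test_videos": 15},
        "ICDAR2015 Video": {"train_videos": 25, "test_videos": 24},
        "BOVText": {
            "train_videos": 1_541, "train_frames": 1_328_575,
            "test_videos": 480, "test_frames": 429_023,
        },
    }

    for name, split in VTS_SPLITS.items():
        print(f"{name}: {split}")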
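
The quoted training setup maps onto standard PyTorch components. Below is a minimal sketch, assuming Adadelta with an initial learning rate of 0.1 and a StepLR schedule that steps every 200 epochs; the decay factor gamma=0.1 and the placeholder model are assumptions, since the paper only says the learning rate "shrinks every 200 epochs" and defines VLSpotter itself in the linked repository.

    import torch
    from torch import nn
    from torch.optim import Adadelta
    from torch.optim.lr_scheduler import StepLR

    # Placeholder model; the real VLSpotter architecture is in the authors' repo.
    model = nn.Linear(10, 10)

    # Adadelta with an initial learning rate of 0.1, as quoted from the paper.
    optimizer = Adadelta(model.parameters(), lr=0.1)

    # "shrinks every 200 epochs": the decay factor is not given, so
    # gamma=0.1 here is an assumption.
    scheduler = StepLR(optimizer, step_size=200, gamma=0.1)

    for epoch in range(600):
        # ... forward pass, loss weighted by lambda1 = lambda2 = 1, backward ...
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()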