SEMv3: A Fast and Robust Approach to Table Separation Line Detection

Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive ablation studies demonstrate that our proposed KOR module can detect table separation lines quickly and accurately. Furthermore, on public datasets (e.g. WTW, ICDAR-2019 cTDaR Historical and iFLYTAB), SEMv3 achieves state-of-the-art (SOTA) performance.
Researcher Affiliation | Collaboration | Chunxia Qin1, Zhenrong Zhang1, Pengfei Hu1, Chenyu Liu2, Jiefeng Ma1 and Jun Du1; 1University of Science and Technology of China, 2iFLYTEK Research
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/Chunchunwumu/SEMv3.
Open Datasets | Yes | We evaluate the performance of our method on several public datasets. These datasets encompass a wide range of challenging scenarios related to table structure recognition. The ICDAR-2019 cTDaR Historical [Gao et al., 2019] dataset contains 600 training samples and 150 testing samples from archival historical documents. WTW [Long et al., 2021] contains 14,581 wired table images collected from real business scenarios. iFLYTAB [Zhang et al., 2024] contains 12,103 training samples and 5,188 testing samples.
Dataset Splits | No | The paper reports training and testing samples for datasets such as ICDAR-2019 cTDaR Historical and iFLYTAB, but does not explicitly describe a validation split.
Hardware Specification | Yes | All experiments are implemented in PyTorch v1.7.1 and conducted on 4 Nvidia Tesla V100 GPUs with 24 GB of memory.
Software Dependencies | Yes | All experiments are implemented in PyTorch v1.7.1 and conducted on 4 Nvidia Tesla V100 GPUs with 24 GB of memory.
Experiment Setup | Yes | Models are trained end-to-end for 100 epochs. We use Adam [Kingma and Ba, 2014] as the optimizer. The initial learning rate is 1 × 10⁻⁴ and is decayed to 1 × 10⁻⁶ following the cosine annealing strategy [Loshchilov and Hutter, 2016]. All experiments are implemented in PyTorch v1.7.1 and conducted on 4 Nvidia Tesla V100 GPUs with 24 GB of memory. During training, the ground-truth grid box coordinates are used when extracting grid representation features with RoIAlign. ... The channel number C of the feature map F is 256, the grid feature channel number Cg is 512, and the sampling step size t is 32.
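
As a rough aid to reproduction, the reported schedule and grid-feature extraction can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions, not the authors' code: the model is a placeholder module, the grid box coordinates are illustrative values, and the RoIAlign output size is not given in the excerpt, so 7×7 is assumed.

```python
# Minimal sketch of the reported training setup (assumptions noted inline).
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision.ops import roi_align

model = nn.Conv2d(3, 256, kernel_size=3, padding=1)  # placeholder for SEMv3; C = 256 as reported

# Adam with initial lr 1e-4, cosine-annealed to 1e-6 over the 100 training epochs.
optimizer = Adam(model.parameters(), lr=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=100, eta_min=1e-6)

# Grid features are pooled from the C = 256 feature map F with RoIAlign, using
# ground-truth grid boxes during training; these box values are illustrative only.
feature_map = torch.randn(1, 256, 64, 64)                    # stand-in for F
gt_grid_boxes = torch.tensor([[0.0, 4.0, 4.0, 20.0, 12.0]])  # (batch_idx, x1, y1, x2, y2)
grid_features = roi_align(feature_map, gt_grid_boxes,
                          output_size=(7, 7),                # assumed, not stated in the excerpt
                          spatial_scale=1.0)

for epoch in range(100):
    # ... forward pass, loss computation, and optimizer.step() would go here ...
    scheduler.step()
```

CosineAnnealingLR with T_max=100 and eta_min=1e-6 reproduces the stated decay from 1 × 10⁻⁴ to 1 × 10⁻⁶ over the 100 epochs; everything else (loss, data loading, the KOR module) is outside the quoted setup and is left as placeholders.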