LORE: Logical Location Regression Network for Table Structure Recognition
Authors: Hangdi Xing, Feiyu Gao, Rujiao Long, Jiajun Bu, Qi Zheng, Liangcheng Li, Cong Yao, Zhi Yu
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard benchmarks demonstrate that LORE consistently outperforms prior arts. |
| Researcher Affiliation | Collaboration | Hangdi Xing*1, Feiyu Gao*3, Rujiao Long3, Jiajun Bu1, Qi Zheng3, Liangcheng Li1, Cong Yao3, Zhi Yu2; 1Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University; 2Zhejiang Provincial Key Laboratory of Service Robot, School of Software Technology, Zhejiang University; 3DAMO Academy, Alibaba Group, Hangzhou, China |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Code is available at https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/LORE-TSR. |
| Open Datasets | Yes | We evaluate LORE on a wide range of benchmarks, including tables in digital-born documents, i.e., ICDAR-2013 (Göbel et al. 2013), SciTSR-comp (Chi et al. 2019), PubTabNet (Zhong, ShafieiBavani, and Jimeno Yepes 2020), TableBank (Li et al. 2020) and TableGraph-24K (Xue et al. 2021), as well as tables from scanned documents and photos, i.e., ICDAR-2019 (Gao et al. 2019) and WTW (Long et al. 2021). |
| Dataset Splits | Yes | It should be noted that ICDAR-2013 provides no training data, so we extend it to the partial version for cross validation following previous works (Raja, Mondal, and Jawahar 2020; Liu et al. 2022, 2021). |
| Hardware Specification | Yes | All the experiments are performed on the platform with 4 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using a DLA-34 backbone, but does not specify software versions for libraries like PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | The model is trained for 100 epochs, and the initial learning rate is chosen as 1×10⁻⁴, decaying to 1×10⁻⁵ and 1×10⁻⁶ at the 70th and 90th epochs for all benchmarks. ... We use the DLA-34 (Yu et al. 2018) backbone, the output stride R = 4 and the number of channels d = 256. ... The number of attention layers is set to 3 for both the base and the stacking regressors. |
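
The reported training schedule (100 epochs, learning rate 1×10⁻⁴ decayed tenfold at epochs 70 and 90) maps directly onto a standard PyTorch step scheduler. The sketch below is an assumption-laden illustration, not the authors' code: the optimizer choice (Adam) and the placeholder model are assumptions, and the DLA-34 backbone, attention regressors, and data pipeline are not reimplemented here.

```python
# Hedged sketch of the reported optimization schedule, assuming a PyTorch
# training loop and an Adam optimizer (the paper does not state the optimizer).
import torch
import torch.nn as nn

model = nn.Linear(256, 256)  # placeholder; LORE uses a DLA-34 backbone (R = 4, d = 256)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[70, 90], gamma=0.1  # 1e-4 -> 1e-5 -> 1e-6
)

for epoch in range(100):
    # ... per-batch forward/backward passes over the benchmark's training split ...
    optimizer.step()   # stand-in for the real per-batch updates
    scheduler.step()   # decay the learning rate after epochs 70 and 90
```

Reproducing the rest of the setup would additionally require the DLA-34 backbone with output stride 4, 256 feature channels, and 3 attention layers in both the base and stacking regressors, as reported above.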