TAPEX: Table Pre-training via Learning a Neural SQL Executor

Authors: Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TAPEX on four benchmark datasets. Experimental results demonstrate that TAPEX outperforms previous table pre-training approaches by a large margin and achieves new state-of-the-art results on all of them.
Researcher Affiliation | Collaboration | Beihang University, Xi'an Jiaotong University, Microsoft Research Asia, Microsoft Azure AI
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code can be found at https://github.com/microsoft/Table-Pretraining.
Open Datasets | Yes | We evaluate the performance of our approach on weakly-supervised WikiSQL (WIKISQL-WEAK) (Zhong et al., 2017), WIKITABLEQUESTIONS (Pasupat & Liang, 2015), SQA (Iyyer et al., 2017), and TABFACT (Chen et al., 2020).
Dataset Splits | Yes | The best pre-training checkpoint is selected based on the loss on the validation set. For both dev and test sets of all datasets, we report the median performance of our approach for five random runs.
Hardware Specification | Yes | It takes about 36 hours on 8 Tesla V100 GPUs to finish the pre-training.
Software Dependencies | No | We implement our approach based on fairseq (Ott et al., 2019).
Experiment Setup | Yes | Our pre-training procedure runs up to 50,000 steps with a batch size of 256. ... For all downstream datasets, the fine-tuning procedure runs up to 20,000 steps with a batch size of 128. For both pre-training and fine-tuning, the learning rate is 3 × 10⁻⁵.
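The experiment-setup quote above gives concrete hyperparameters (50,000 pre-training steps at batch size 256; 20,000 fine-tuning steps at batch size 128; learning rate 3 × 10⁻⁵ for both). The paper's released implementation is built on fairseq, so the sketch below is only an illustrative restatement of those reported values using HuggingFace `TrainingArguments`; the output directory, the per-device batch split across 8 GPUs, and the choice of library are assumptions, not the authors' actual configuration.

```python
from transformers import TrainingArguments

# Hypothetical fine-tuning configuration mirroring the reported setup.
# The authors' code uses fairseq; transformers is used here only for illustration.
finetune_args = TrainingArguments(
    output_dir="tapex-finetune",        # placeholder path (assumption)
    max_steps=20_000,                   # "fine-tuning procedure runs up to 20,000 steps"
    per_device_train_batch_size=16,     # 16 x 8 GPUs = effective batch size 128 (split is an assumption)
    learning_rate=3e-5,                 # same rate reported for pre-training and fine-tuning
)

# Pre-training would analogously use max_steps=50_000 with an effective batch size of 256,
# selecting the best checkpoint by validation loss as described in the Dataset Splits row.
```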