GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

Authors: Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate on four popular semantic parsing benchmarks in both fully supervised and weakly supervised settings. GRAPPA consistently achieves new state-of-the-art results on all of them, significantly outperforming all previously reported results.
Researcher Affiliation | Collaboration | Salesforce Research, Yale University, University of Edinburgh; {tao.yu, yichern.tan, dragomir.radev}@yale.edu, bailin.wang@ed.ac.uk, {wu.jason, x.yang, rsocher, cxiong}@salesforce.com, victorialin@fb.com
Pseudocode | No | The paper describes its methods in detail through text and diagrams (Figure 1), but does not contain a formal pseudocode or algorithm block.
Open Source Code | Yes | The pre-trained embeddings can be downloaded at https://huggingface.co/Salesforce/grappa_large_jnt (a minimal loading sketch is given after this table).
Open Datasets | Yes | We use WIKITABLES (Bhagavatula et al., 2015), which contains 1.6 million high-quality relational Wikipedia tables. We collected seven high-quality datasets for textual-tabular data understanding (Table 8 in the Appendix), all of which contain Wikipedia tables or databases and the corresponding natural language utterances written by humans. SPIDER (Yu et al., 2018b), WIKISQL (Zhong et al., 2017), and WIKITABLEQUESTIONS (Pasupat & Liang, 2015) are listed in Table 2 with citations. A dataset-loading sketch is given after this table.
Dataset Splits | Yes | We conduct experiments on four cross-domain table semantic parsing tasks... The data statistics and examples on each task are shown in Table 2... We experiment with two different settings... Dev./Test splits are reported in Table 3 (SPIDER), Table 4 (fully-supervised WIKISQL), Table 5 (WIKITABLEQUESTIONS), and Table 6 (weakly-supervised WIKISQL).
Hardware Specification | Yes | We fine-tune GRAPPA for 300k steps on eight 16GB Nvidia V100 GPUs.
Software Dependencies | No | The paper mentions modifying the RoBERTa code implemented by Wolf et al. (2019) but does not provide specific version numbers for software dependencies such as PyTorch, Python, or other libraries.
Experiment Setup | Yes | For fine-tuning RoBERTa, we modify the RoBERTa code implemented by Wolf et al. (2019), follow the hyperparameters for fine-tuning RoBERTa on RACE tasks, and use batch size 24, learning rate 1e-5, and the Adam optimizer (Kingma & Ba, 2014). ... We follow the default hyperparameters from Devlin et al. (2019) with a 15% masking probability. A configuration sketch reflecting these hyperparameters is given after this table.
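
To illustrate the "Open Source Code" entry: the sketch below loads the released checkpoint through the HuggingFace transformers AutoModel interface (GraPPa uses the RoBERTa-large architecture). The question/column-header serialization shown here is only an assumed input format for demonstration, not taken from the authors' code.

```python
from transformers import AutoTokenizer, AutoModel

# Load the released GraPPa checkpoint (RoBERTa-large architecture).
tokenizer = AutoTokenizer.from_pretrained("Salesforce/grappa_large_jnt")
model = AutoModel.from_pretrained("Salesforce/grappa_large_jnt")

# Illustrative input: a question concatenated with table column headers.
# The "</s>"-separated serialization is an assumption for this sketch.
question = "how many singers are there"
columns = ["singer id", "name", "country", "age"]
text = question + " </s> " + " </s> ".join(columns)

inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```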
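For the "Open Datasets" entry, the benchmarks are distributed by their original authors; the snippet below is only a convenience sketch that assumes mirrors of these dataset identifiers exist on the HuggingFace Hub (the paper itself does not reference them).

```python
from datasets import load_dataset

# Assumed HuggingFace Hub identifiers; the paper cites the original releases
# (Zhong et al., 2017; Yu et al., 2018b; Pasupat & Liang, 2015) instead.
wikisql = load_dataset("wikisql")             # text-to-SQL over single tables
spider = load_dataset("spider")               # cross-domain text-to-SQL
wtq = load_dataset("wikitablequestions")      # weakly supervised table QA

print(wikisql["train"][0]["question"])
print(spider["validation"][0]["query"])
```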
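For the "Experiment Setup" entry, this is a minimal sketch of the reported hyperparameters (batch size 24, learning rate 1e-5, Adam, 15% MLM masking) using the transformers library. It covers only the MLM objective, omits GraPPa's SQL semantic prediction head, and the toy inputs are invented for illustration.

```python
import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

# 15% masking probability, following the BERT defaults cited in the paper.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

# Batch size 24, learning rate 1e-5, Adam optimizer (Kingma & Ba, 2014).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# Toy question/column-header strings standing in for the pre-training corpus.
texts = [
    "how many singers are there </s> singer id </s> name </s> country </s> age",
    "list all countries </s> singer id </s> name </s> country",
]
features = [tokenizer(t) for t in texts]
loader = DataLoader(features, batch_size=24, collate_fn=collator, shuffle=True)

model.train()
for batch in loader:
    loss = model(**batch).loss   # masked-token prediction loss only
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```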