Graph-Based Transformer with Cross-Candidate Verification for Semantic Parsing
Authors: Bo Shao, Yeyun Gong, Weizhen Qi, Guihong Cao, Jianshu Ji, Xiaola Lin
AAAI 2020, pp. 8807-8814 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on three semantic parsing benchmarks: ATIS, JOBS, and the Task Oriented Semantic Parsing (TOP) dataset. Experiments show that our graph-based reranking model achieves results comparable to state-of-the-art models on the ATIS and JOBS datasets, and on the TOP dataset our model achieves a new state-of-the-art result. |
| Researcher Affiliation | Collaboration | Bo Shao (1,2), Yeyun Gong (2), Weizhen Qi (2,3), Guihong Cao (4), Jianshu Ji (4), Xiaola Lin (1). Affiliations: 1 Sun Yat-sen University; 2 Microsoft Research Asia; 3 University of Science and Technology of China; 4 Microsoft AI and Research, Redmond WA, USA |
| Pseudocode | No | The paper describes the model architecture and components in prose and with diagrams, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology (e.g., no repository link, no explicit statement of code release). |
| Open Datasets | Yes | We conduct our experiment on 3 semantic parsing datasets: JOBS, ATIS and TOP. JOBS is a dataset containing 640 queries annotated from a database of job listings. Questions are paired with Prolog-style queries. We follow the training and test split in (Zettlemoyer and Collins 2012). ATIS is a dataset containing 5410 queries to a flight booking system (Hemphill, Godfrey, and Doddington 1990). The data has been split into 4480 training instances, 480 validation instances, and 450 test instances. TOP is a large-scale semantic parsing dataset (Gupta et al. 2018), containing 44,783 annotated question and parse tree pairs, which are split into 31,279 for training, 4,462 for validation and 9,042 for test. (Gupta et al. 2018) is cited in the references. |
| Dataset Splits | Yes | ATIS is a dataset containing 5410 queries to a flight booking system (Hemphill, Godfrey, and Doddington 1990). The data has been split into 4480 training instances, 480 validation instances, and 450 test instances. TOP is a large-scale semantic parsing dataset (Gupta et al. 2018), containing 44,783 annotated question and parse tree pairs, which are split into 31,279 for training, 4,462 for validation and 9,042 for test. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components such as BERT-Base and GloVe word embeddings, but does not provide specific version numbers for these or any other ancillary software, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | In our generation model, GloVe word embeddings (Pennington, Socher, and Manning 2014) are used as our pretrained word embeddings. Input sentences are lower-cased. The beam size of the model is 10. We set the dropout rate to 0.5. The dimension of all hidden vectors and word embeddings is set to 300. The word vocabulary is not shared between encoder and decoder. Parameters are randomly initialized from a uniform distribution (-0.01, 0.01). We use Adagrad (Duchi, Hazan, and Singer 2011) as the optimizer during training, and an early-stop strategy is used to decide the training epoch. In our GTCV ranking model, we load the pre-trained model of BERT-Base with small parameters to reduce the training time. The number of Transformer blocks is 12, and the dimension of all hidden states is 768 in our model. The batch size of the model is 32. Dropout is set to 0.1. We set the block number n of our graph-based transformer to 3. The score weight α in our model is set to 0.1. We use all 10 candidates from beam search to train and evaluate our ranking model. We optimize our model with an Adam optimizer with an initial learning rate of 3e-4, β1 = 0.9, β2 = 0.999 and γ = 10^-9. Gradient accumulation is used in our training, and the accumulation step is set to 12. (Configuration sketches based on these reported values are given below the table.) |
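
To make the reported generation-model settings concrete, here is a minimal PyTorch-style sketch. It is not the authors' implementation: the LSTM encoder, the toy vocabulary size, and all variable names are assumptions introduced for illustration, while the 300-dimensional embeddings and hidden states, the 0.5 dropout, the uniform (-0.01, 0.01) initialization, and the Adagrad optimizer come from the setup quoted above.

```python
import torch
import torch.nn as nn

# Values reported in the paper.
EMBED_DIM = HIDDEN_DIM = 300   # embedding and hidden dimensions
DROPOUT = 0.5                  # dropout rate of the generation model
BEAM_SIZE = 10                 # beam size at decoding time (not exercised in this sketch)

# Stand-in modules; the actual encoder-decoder is not released.
VOCAB_SIZE = 10_000            # illustrative only; the paper does not report vocabulary sizes
embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
encoder = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
dropout = nn.Dropout(DROPOUT)

# Uniform initialization in (-0.01, 0.01), as reported.
for module in (embedding, encoder):
    for param in module.parameters():
        nn.init.uniform_(param, -0.01, 0.01)

# Adagrad optimizer, as reported; the paper does not state its learning rate,
# so the PyTorch default is kept here.
params = list(embedding.parameters()) + list(encoder.parameters())
optimizer = torch.optim.Adagrad(params)
```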
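
The GTCV ranking-model optimizer and gradient-accumulation settings can be sketched in the same hedged way. The stand-in scorer, the synthetic tensors, and the binary cross-entropy loss below are placeholders (no code is released); only the Adam hyperparameters (learning rate 3e-4, β1 = 0.9, β2 = 0.999, with the paper's γ = 10^-9 read as Adam's epsilon), the batch size of 32, and the accumulation step of 12 are taken from the reported setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

BATCH_SIZE = 32     # batch size from the paper
ACCUM_STEPS = 12    # gradient accumulation step from the paper

# Stand-in for the graph-based transformer ranker (BERT-Base encoder, hidden size 768,
# 3 graph-based transformer blocks); a tiny scorer is used so the loop is runnable.
scorer = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 1))

# Adam settings from the paper; γ = 10^-9 is interpreted here as Adam's epsilon.
optimizer = torch.optim.Adam(scorer.parameters(), lr=3e-4, betas=(0.9, 0.999), eps=1e-9)

# Synthetic candidate features and relevance labels, for illustration only.
features = torch.randn(BATCH_SIZE * ACCUM_STEPS, 768)
labels = torch.rand(BATCH_SIZE * ACCUM_STEPS, 1)
loader = DataLoader(TensorDataset(features, labels), batch_size=BATCH_SIZE)

loss_fn = nn.BCEWithLogitsLoss()
optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = loss_fn(scorer(x), y)
    (loss / ACCUM_STEPS).backward()       # scale so accumulated gradients average over the micro-batches
    if (step + 1) % ACCUM_STEPS == 0:     # update once every 12 batches of 32
        optimizer.step()
        optimizer.zero_grad()
```

With an accumulation step of 12 over batches of 32, each optimizer update effectively aggregates gradients from 384 examples.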