Better AMR-To-Text Generation with Graph Structure Reconstruction

Authors: Tianming Wang, Xiaojun Wan, Shaowei Yao

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two benchmark datasets show that our proposed model improves considerably over strong baselines and achieves new state-of-the-art. ... 3 Experiment ... 3.3 Comparison Results ... 3.4 Ablation Study
Researcher Affiliation | Academia | Tianming Wang, Xiaojun Wan and Shaowei Yao; Wangxuan Institute of Computer Technology, Peking University; The MOE Key Laboratory of Computational Linguistics, Peking University; {wangtm, wanxiaojun, yaosw}@pku.edu.cn
Pseudocode | No | The paper describes algorithmic steps and equations within the text, but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block or figure.
Open Source Code | Yes | The code is available at https://github.com/sodawater/graph-reconstruction-amr2text.
Open Datasets | Yes | Two standard English AMR corpora (LDC2015E86 and LDC2017T10) are used as our evaluation datasets.
Dataset Splits | Yes | The LDC2015E86 dataset contains 16833 training instances, 1368 development instances, and 1371 test instances. The LDC2017T10 dataset contains 36521 training instances and uses the same development and test instances as LDC2015E86.
Hardware Specification | No | The paper does not mention any specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions 'Glove vectors [Pennington et al., 2014]' and the 'Adam optimizer [Kingma and Ba, 2015]' with its parameters. However, it does not provide specific version numbers for software libraries or frameworks used (e.g., 'PyTorch 1.x' or 'TensorFlow 2.x').
Experiment Setup | Yes | We set the model parameters based on preliminary experiments on the development set. d_model is set to 512. The numbers L_1, L_2 of layers of the encoder and decoder are both set to 6. The head number K is set to 2. The batch size is set to 64. λ_n is set to 0.1, λ_l is set to 0.4 and λ_d is set to 0.1. We share the vocabulary of the encoder and decoder, and use Glove vectors [Pennington et al., 2014] to initialize the word embeddings and d_emb is set to 300. We apply dropout and use a rate of 0.2. Label smoothing is employed and the rate is set to 0.1. We use the Adam optimizer [Kingma and Ba, 2015] with β1 = 0.9, β2 = 0.98 and ϵ = 10^-9. The same learning rate schedule of Vaswani et al. [2017] is adopted and the maximum learning rate is set to 0.0005. During inference, beam search with size 5 is used.
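
For concreteness, the hyperparameters quoted above can be collected into a single training configuration. The sketch below is a minimal PyTorch-style illustration under stated assumptions, not the authors' released code (see the repository linked above): names such as TrainConfig and noam_lr, and the warmup_steps value, are assumptions, since the quoted setup does not report a warmup length. The schedule follows Vaswani et al. [2017], rising linearly during warmup and then decaying with the inverse square root of the step number, rescaled here so that its peak equals 0.0005.

```python
# Minimal sketch of the reported training setup (PyTorch-style).
# NOTE: not the authors' released code; TrainConfig, noam_lr, and
# warmup_steps=4000 are illustrative assumptions.
from dataclasses import dataclass

import torch


@dataclass
class TrainConfig:
    d_model: int = 512         # hidden size
    d_emb: int = 300           # GloVe word embedding size
    enc_layers: int = 6        # L_1
    dec_layers: int = 6        # L_2
    num_heads: int = 2         # K
    batch_size: int = 64
    dropout: float = 0.2
    label_smoothing: float = 0.1
    lambda_n: float = 0.1      # auxiliary loss weights λ_n, λ_l, λ_d from the paper
    lambda_l: float = 0.4
    lambda_d: float = 0.1
    max_lr: float = 5e-4       # reported maximum learning rate
    warmup_steps: int = 4000   # assumption: not reported in the quoted setup
    beam_size: int = 5


def noam_lr(step: int, cfg: TrainConfig) -> float:
    """Learning rate schedule of Vaswani et al. [2017], rescaled so the
    peak (reached at step == warmup_steps) equals cfg.max_lr."""
    step = max(step, 1)
    base = cfg.d_model ** -0.5 * min(step ** -0.5, step * cfg.warmup_steps ** -1.5)
    peak = cfg.d_model ** -0.5 * cfg.warmup_steps ** -0.5
    return cfg.max_lr * base / peak


cfg = TrainConfig()
model = torch.nn.Linear(cfg.d_model, cfg.d_model)  # placeholder for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=noam_lr(1, cfg),
                             betas=(0.9, 0.98), eps=1e-9)
# LambdaLR scales the base lr, so divide out noam_lr(1, cfg).
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: noam_lr(step + 1, cfg) / noam_lr(1, cfg))
```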