Data-to-Text Generation with Content Selection and Planning

Authors: Ratish Puduppully, Li Dong, Mirella Lapata (pp. 6908-6915)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Automatic and human-based evaluation experiments show that our model outperforms strong baselines, improving the state of the art on the recently released ROTOWIRE dataset.
Researcher Affiliation | Academia | Ratish Puduppully, Li Dong, Mirella Lapata; Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB; r.puduppully@sms.ed.ac.uk, li.dong@ed.ac.uk, mlap@inf.ed.ac.uk
Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code is publicly available at https://github.com/ratishsp/data2text-plan-py.
Open Datasets | Yes | We trained and evaluated our model on ROTOWIRE (Wiseman et al. 2017), a dataset of basketball game summaries, paired with corresponding box- and line-score tables.
Dataset Splits | Yes | We followed the data partitions introduced in Wiseman et al. (2017): we trained on 3,398 summaries, tested on 728, and used 727 for validation. (A split-check sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions OpenNMT-py (Klein et al. 2017) as the implementation framework but does not specify a version number for it or for other software components.
Experiment Setup | Yes | We validated model hyperparameters on the development set. We did not tune the dimensions of word embeddings and LSTM hidden layers; we used the same value of 600 reported in Wiseman et al. (2017). We used one-layer pointer networks during content planning and two-layer LSTMs during text generation. Input feeding (Luong et al. 2015) was employed for the text decoder. We applied dropout (Zaremba et al. 2014) at a rate of 0.3. Models were trained for 25 epochs with the Adagrad optimizer (Duchi et al. 2011); the initial learning rate was 0.15, learning rate decay was selected from {0.5, 0.97}, and the batch size was 5. For text decoding, we used truncated BPTT (Mikolov et al. 2010) with a truncation size of 100. We set the beam size to 5 during inference. (A configuration sketch follows the table.)
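
As a convenience for the Dataset Splits row above, the minimal sketch below checks the reported partition sizes locally. It assumes the JSON layout of the publicly released boxscore-data package (train.json, valid.json, test.json under a rotowire/ directory); those file names are an assumption about the dataset download, not something stated in the paper.

```python
# Minimal sketch, not from the paper: verify the ROTOWIRE partition sizes.
# Assumes the boxscore-data release layout (rotowire/{train,valid,test}.json),
# which is an assumption about the download, not a claim of the paper.
import json

EXPECTED = {"train.json": 3398, "valid.json": 727, "test.json": 728}

for fname, expected in EXPECTED.items():
    with open(f"rotowire/{fname}", encoding="utf-8") as f:
        games = json.load(f)  # one record per game: box/line scores plus its summary
    status = "OK" if len(games) == expected else "MISMATCH"
    print(f"{fname}: {len(games)} games (expected {expected}) {status}")
```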
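
To make the Experiment Setup row easier to scan, the sketch below collects the reported hyperparameters in one place and shows how a two-layer LSTM decoder with input feeding and the Adagrad optimizer might be instantiated in PyTorch. It is illustrative only: the wiring is an assumption, not the authors' OpenNMT-py-based implementation.

```python
# Hedged sketch: restates the hyperparameters reported in the paper and wires up
# an illustrative decoder/optimizer. This is NOT the authors' implementation.
import torch
import torch.nn as nn

HPARAMS = {
    "emb_dim": 600,                   # word embedding size (same 600 as Wiseman et al. 2017)
    "hidden_dim": 600,                # LSTM hidden size
    "planner_layers": 1,              # one-layer pointer network for content planning
    "decoder_layers": 2,              # two-layer LSTM text decoder
    "dropout": 0.3,                   # dropout rate (Zaremba et al. 2014)
    "epochs": 25,
    "learning_rate": 0.15,            # Adagrad initial learning rate
    "lr_decay_choices": (0.5, 0.97),  # decay factor selected on the validation set
    "batch_size": 5,
    "bptt_truncation": 100,           # truncated BPTT length for the text decoder
    "beam_size": 5,                   # beam search width at inference time
}

# Input feeding (Luong et al. 2015) concatenates the previous attentional state
# with the current word embedding, hence the enlarged decoder input size.
decoder = nn.LSTM(
    input_size=HPARAMS["emb_dim"] + HPARAMS["hidden_dim"],
    hidden_size=HPARAMS["hidden_dim"],
    num_layers=HPARAMS["decoder_layers"],
    dropout=HPARAMS["dropout"],
)

optimizer = torch.optim.Adagrad(decoder.parameters(), lr=HPARAMS["learning_rate"])
```

The learning-rate decay schedule, truncated BPTT loop, and beam search at inference are part of the reported setup but are not shown in this sketch.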