Data-to-Text Generation with Content Selection and Planning
Authors: Ratish Puduppully, Li Dong, Mirella Lapata
AAAI 2019, pp. 6908-6915
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Automatic and human-based evaluation experiments show that our model outperforms strong baselines improving the state-of-the-art on the recently released ROTOWIRE dataset. |
| Researcher Affiliation | Academia | Ratish Puduppully, Li Dong, Mirella Lapata Institute for Language, Cognition and Computation School of Informatics, University of Edinburgh 10 Crichton Street, Edinburgh EH8 9AB r.puduppully@sms.ed.ac.uk, li.dong@ed.ac.uk, mlap@inf.ed.ac.uk |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/ratishsp/data2text-plan-py. |
| Open Datasets | Yes | We trained and evaluated our model on ROTOWIRE (Wiseman et al. 2017), a dataset of basketball game summaries, paired with corresponding box- and line-score tables. |
| Dataset Splits | Yes | We followed the data partitions introduced in Wiseman et al. (2017): we trained on 3,398 summaries, tested on 728, and used 727 for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'OpenNMT-py (Klein et al. 2017)' as the implementation framework but does not specify a version number for it or other ancillary software components. |
| Experiment Setup | Yes | We validated model hyperparameters on the development set. We did not tune the dimensions of word embeddings and LSTM hidden layers; we used the same value of 600 reported in Wiseman et al. (2017). We used one-layer pointer networks during content planning, and two-layer LSTMs during text generation. Input feeding (Luong et al. 2015) was employed for the text decoder. We applied dropout (Zaremba et al. 2014) at a rate of 0.3. Models were trained for 25 epochs with the Adagrad optimizer (Duchi et al. 2011); the initial learning rate was 0.15, learning rate decay was selected from {0.5, 0.97}, and batch size was 5. For text decoding, we made use of BPTT (Mikolov et al. 2010) and set the truncation size to 100. We set the beam size to 5 during inference. |
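
As a quick reference for anyone attempting to re-run the experiments, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The sketch below is illustrative only: the dictionary keys are names chosen for this summary and are not the actual command-line flags of the authors' data2text-plan-py fork of OpenNMT-py.

```python
# Hyperparameters reported in the paper, gathered for convenience.
# Key names are illustrative; consult the authors' repository
# (https://github.com/ratishsp/data2text-plan-py) for the exact
# options used to train the released model.
reported_config = {
    "word_embedding_dim": 600,         # same value as Wiseman et al. (2017), not tuned
    "lstm_hidden_dim": 600,            # same value as Wiseman et al. (2017), not tuned
    "content_plan_decoder_layers": 1,  # one-layer pointer network
    "text_decoder_layers": 2,          # two-layer LSTM with input feeding
    "dropout": 0.3,
    "optimizer": "adagrad",
    "initial_learning_rate": 0.15,
    "learning_rate_decay_candidates": [0.5, 0.97],  # selected on the validation set
    "batch_size": 5,
    "epochs": 25,
    "bptt_truncation": 100,            # truncated BPTT for the text decoder
    "beam_size": 5,                    # used at inference time
    "data_splits": {"train": 3398, "valid": 727, "test": 728},  # ROTOWIRE summaries
}
```

Only the learning-rate decay was chosen from the listed candidates on the development set; the embedding and hidden dimensions were carried over unchanged from Wiseman et al. (2017).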