Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Variational Template Machine for Data-to-Text Generation

Authors: Rong Ye, Wenxian Shi, Hao Zhou, Zhongyu Wei, Lei Li

ICLR 2020

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | Experiments on datasets from a variety of different domains show that VTM is able to generate more diversely while keeping good fluency and quality.
Researcher Affiliation | Collaboration | Fudan University; ByteDance AI Lab ({shiwenxian,zhouhao.nlp,lileilab}@bytedance.com)
Pseudocode | Yes | Algorithm 1: Training procedure
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the VTM model, nor does it include a link to a code repository.
Open Datasets | Yes | We perform the experiment on SPNLG (Reed et al., 2018) and WIKI (Lebret et al., 2016; Wang et al., 2018b). ... https://nlds.soe.ucsc.edu/sentence-planning-NLG ... https://eaglew.github.io/patents/
Dataset Splits | Yes | The statistics for the number of table-text pairs and raw texts in the training, validation and test sets are shown in Table 2. ... Table 2: Dataset statistics in our experiments. ... Valid: SPNLG 20,495; WIKI 72,831
Hardware Specification | Yes | We train and test the models on a single Tesla V100 GPU.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify versions of the programming language or software libraries (e.g., Python, PyTorch/TensorFlow) used for implementation.
Experiment Setup | Yes | Word embeddings are randomly initialized with 300 dimensions. During training, we use the Adam optimizer (Kingma & Ba, 2015) with the initial learning rate set to 0.001. Details on hyperparameters are listed in Appendix D. ... For the model trained on the WIKI dataset, the dimension of the latent template variable is set to 100 and the dimension of the latent content variable is set to 200. ... For the hyperparameters of the total loss Ltot, we set λMI = 0.5, λpt = 1.0 and λpc = 0.5.
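The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. The sketch below is illustrative only: the λ values, dimensions, and learning rate come from the quoted text, but the exact composition of the total loss Ltot (which term is the base objective, and the signs of the auxiliary terms) is an assumption, and the `total_loss` helper and its loss-term arguments are hypothetical names, not the paper's implementation.

```python
# Reported hyperparameters for VTM (WIKI configuration), per the quoted setup.
HPARAMS = {
    "embedding_dim": 300,    # word embeddings, randomly initialized
    "learning_rate": 1e-3,   # Adam optimizer (Kingma & Ba, 2015)
    "template_dim": 100,     # latent template variable dimension (WIKI)
    "content_dim": 200,      # latent content variable dimension (WIKI)
    "lambda_mi": 0.5,        # weight on the mutual-information term
    "lambda_pt": 1.0,        # weight on the template-preserving term
    "lambda_pc": 0.5,        # weight on the content-preserving term
}

def total_loss(base_loss, mi_loss, pt_loss, pc_loss, hp=HPARAMS):
    """Assumed weighted sum for Ltot; loss terms are placeholder scalars here.

    In the actual model each term would be computed from the variational
    objective and auxiliary losses described in the paper.
    """
    return (base_loss
            + hp["lambda_mi"] * mi_loss
            + hp["lambda_pt"] * pt_loss
            + hp["lambda_pc"] * pc_loss)
```

With placeholder loss values of 1.0, 2.0, 3.0, and 4.0, the weighted sum evaluates to 1.0 + 0.5·2.0 + 1.0·3.0 + 0.5·4.0 = 7.0.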