Generating Sentences Using a Dynamic Canvas
Authors: Harshil Shah, Bowen Zheng, David Barber
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report test set results on the Book Corpus dataset in table 1. We evaluate the ELBO on the test set by drawing 1,000 samples of the latent vector z per data point. We see that AUTR, both with T = 30 and T = 40, trained with or without dropout, achieves a higher ELBO and lower perplexity than Gen-RNN. (A sketch of this Monte Carlo ELBO evaluation appears below the table.) |
| Researcher Affiliation | Academia | Harshil Shah (University College London); Bowen Zheng (University College London); David Barber (University College London and Alan Turing Institute) |
| Pseudocode | Yes | Algorithm 1: AUTR generative process |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We train our model on the Book Corpus dataset (Zhu et al. 2015), which is composed of sentences from 11,038 unpublished books. |
| Dataset Splits | No | The paper states 'Of the 53M sentences that meet these criteria, we use 90% for training, and 10% for testing,' but there is no explicit mention of a separate validation split or how it was handled. |
| Hardware Specification | No | The paper mentions training times ('They take, on average, 0.19 and 0.17 seconds per training iteration...'), but it does not specify any details about the hardware (e.g., CPU, GPU model, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'We implement both models in Python, using the Theano (Theano Development Team 2016) and Lasagne (Dieleman et al. 2015) libraries.' While the libraries are cited, specific version numbers for Python, Theano, or Lasagne are not explicitly provided. |
| Experiment Setup | Yes | For both AUTR and Gen-RNN, we use a 50 dimensional latent representation z and the RNN hidden states have 500 units each. We train both models for 1,000,000 iterations, using Adam (Kingma and Ba 2015) with an initial learning rate of 10^-4 and mini-batches of size 200. We multiply the KL divergence term by a constant weight, which we linearly anneal from 0 to 1 over the first 20,000 iterations of training. When training Gen-RNN, we randomly drop out 30% of the words. (A sketch of this training configuration appears below the table.) |
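
The test-set evaluation quoted in the Research Type row, drawing 1,000 samples of z per data point to estimate the ELBO, can be made concrete with a short sketch. This is not the authors' code: the function names, the dummy log-densities, and the perplexity conversion are assumptions based on the standard VAE evaluation recipe of averaging log p(x|z) + log p(z) - log q(z|x) over samples from the approximate posterior.

```python
import numpy as np

def monte_carlo_elbo(log_px_given_z, log_pz, log_qz_given_x):
    """Monte Carlo estimate of the ELBO for one data point, from S samples
    z_1..z_S drawn from q(z|x). Each argument is an array of shape (S,)."""
    return np.mean(log_px_given_z + log_pz - log_qz_given_x)

def perplexity_bound(total_elbo, total_words):
    """Upper bound on corpus perplexity implied by the summed ELBO."""
    return np.exp(-total_elbo / total_words)

if __name__ == "__main__":
    S = 1000  # samples of z per data point, as reported in the paper
    rng = np.random.default_rng(0)
    # Dummy log-densities standing in for the model's real outputs.
    demo = monte_carlo_elbo(rng.normal(-40, 1, S),
                            rng.normal(-30, 1, S),
                            rng.normal(-25, 1, S))
    print(f"per-sentence ELBO estimate: {demo:.2f}")
```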
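
The training configuration quoted in the Experiment Setup row can likewise be summarised in a minimal sketch. Only the numeric values come from the paper; the dictionary keys, the `kl_weight` helper, and the loss expression in the trailing comment are assumptions about how a generic VAE training loop would use them.

```python
# Reported hyperparameters for AUTR and Gen-RNN (values from the paper).
CONFIG = {
    "latent_dim": 50,           # dimensionality of z
    "rnn_hidden_units": 500,    # hidden state size of each RNN
    "iterations": 1_000_000,
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 200,
    "kl_anneal_iterations": 20_000,
    "word_dropout": 0.3,        # applied when training Gen-RNN only
}

def kl_weight(iteration, anneal_iterations=CONFIG["kl_anneal_iterations"]):
    """Linearly anneal the KL weight from 0 to 1 over the first
    `anneal_iterations` training iterations, then hold it at 1."""
    return min(1.0, iteration / anneal_iterations)

# The annealed objective at a given iteration would then take the form
#   loss = -reconstruction_log_likelihood + kl_weight(iteration) * kl_divergence
```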