Recurrent Hierarchical Topic-Guided RNN for Language Generation
Authors: Dandan Guo, Bo Chen, Ruiying Lu, Mingyuan Zhou
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on a variety of real-world text corpora demonstrate that the proposed model not only outperforms larger-context RNN-based language models, but also learns interpretable recurrent multilayer topics and generates diverse sentences and paragraphs that are syntactically correct and semantically coherent. |
| Researcher Affiliation | Academia | 1National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China. 2McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA. Correspondence to: Bo Chen <bchen@mail.xidian.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 Hybrid TLASGR-MCMC and recurrent autoencoding variational inference for rGBN-RNN. |
| Open Source Code | Yes | Python (TensorFlow) code is provided at https://github.com/Dan123dan/rGBN-RNN |
| Open Datasets | Yes | We consider three publicly available corpora, including APNEWS, IMDB, and BNC. The links, preprocessing steps, and summary statistics for them are deferred to Appendix C. |
| Dataset Splits | Yes | The APNEWS corpus we consider here consists of 2246 documents in total, where the first 2000 are used for training, 100 for validation, and the remaining 146 for testing. ... For BNC, we split the corpus into training, validation and test set with a ratio of 0.8:0.1:0.1. |
| Hardware Specification | Yes | without pre-training, we have tried training the GPT-2 directly with the APNEWS corpus on one machine with 4 NVIDIA RTX 2080 Ti GPUs |
| Software Dependencies | No | Python (TensorFlow) code is provided at https://github.com/Dan123dan/rGBN-RNN. While TensorFlow is mentioned, a specific version number is not provided, nor are version numbers for Python or other libraries. |
| Experiment Setup | Yes | Dropout with a rate of 0.4 is applied to the input of the stacked-RNN at each layer... The gradients are clipped if their norm exceeds 5. We use the Adam optimizer (Kingma & Ba, 2015) with learning rate 10^-3. The length of an input sentence is fixed to 30. We set the mini-batch size as 8, the number of training epochs as 5, and τ0 as 1. |
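
For orientation, the sketch below shows one way the reported training configuration (dropout 0.4, gradient clipping at norm 5, Adam with learning rate 10^-3, sentence length 30, mini-batch size 8, 5 training epochs) could be wired up in TensorFlow. It is not the authors' released implementation: the TF2/Keras API, the vocabulary size, the embedding and hidden dimensions, and the plain stacked-LSTM language model are assumptions made only to illustrate the hyperparameters; the actual rGBN-RNN couples its stacked RNN with the recurrent gamma belief network described in the paper.

```python
# Hypothetical configuration sketch for the reported hyperparameters.
# Assumptions: TensorFlow 2 / Keras API, vocabulary size, embedding and
# hidden dimensions, and a plain stacked-LSTM language model stand in for
# the full rGBN-RNN architecture.
import tensorflow as tf

VOCAB_SIZE = 10000   # assumption: not reported in this table
SEQ_LEN = 30         # fixed input-sentence length (reported)
BATCH_SIZE = 8       # mini-batch size (reported)
EPOCHS = 5           # number of training epochs (reported)

def build_stacked_rnn_lm(num_layers=3, embed_dim=256, hidden_dim=512):
    """Toy stacked-RNN language model; dropout 0.4 is applied to each layer's input."""
    inputs = tf.keras.Input(shape=(SEQ_LEN,), dtype="int32")
    x = tf.keras.layers.Embedding(VOCAB_SIZE, embed_dim)(inputs)
    for _ in range(num_layers):
        x = tf.keras.layers.Dropout(0.4)(x)                       # reported dropout rate
        x = tf.keras.layers.LSTM(hidden_dim, return_sequences=True)(x)
    logits = tf.keras.layers.Dense(VOCAB_SIZE)(x)
    return tf.keras.Model(inputs, logits)

model = build_stacked_rnn_lm()
# Adam with learning rate 1e-3; clipnorm approximates the reported clipping at norm 5.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=5.0)
model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# model.fit(train_dataset, validation_data=val_dataset,
#           batch_size=BATCH_SIZE, epochs=EPOCHS)
```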