Recurrent Hierarchical Topic-Guided RNN for Language Generation
Authors: Dandan Guo, Bo Chen, Ruiying Lu, Mingyuan Zhou
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on a variety of real-world text corpora demonstrate that the proposed model not only outperforms larger-context RNN-based language models, but also learns interpretable recurrent multilayer topics and generates diverse sentences and paragraphs that are syntactically correct and semantically coherent. |
| Researcher Affiliation | Academia | 1National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China. 2McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA. Correspondence to: Bo Chen <bchen@mail.xidian.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 Hybrid TLASGR-MCMC and recurrent autoencoding variational inference for rGBN-RNN. |
| Open Source Code | Yes | Python (TensorFlow) code is provided at https://github.com/Dan123dan/rGBN-RNN |
| Open Datasets | Yes | We consider three publicly available corpora, including APNEWS, IMDB, and BNC. The links, preprocessing steps, and summary statistics for them are deferred to Appendix C. |
| Dataset Splits | Yes | The APNEWS corpus we consider here consists of 2246 documents in total, where the first 2000 are used for training, 100 for validation, and the remaining 146 for testing. ... For BNC, we split the corpus into training, validation and test set with a ratio of 0.8:0.1:0.1. |
| Hardware Specification | Yes | without pre-training, we have tried training the GPT-2 directly with the APNEWS corpus on one machine with 4 NVIDIA RTX 2080 Ti GPUs |
| Software Dependencies | No | Python (TensorFlow) code is provided at https://github.com/Dan123dan/rGBN-RNN. While TensorFlow is mentioned, a specific version number is not provided, nor are version numbers for Python or other libraries. |
| Experiment Setup | Yes | Dropout with a rate of 0.4 is applied to the input of the stacked-RNN at each layer... The gradients are clipped if their norm exceeds 5. We use the Adam optimizer (Kingma & Ba, 2015) with learning rate 10^-3. The length of an input sentence is fixed to 30. We set the mini-batch size as 8, the number of training epochs as 5, and τ0 as 1. |
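
For orientation, the sketch below shows one way the reported training configuration (dropout 0.4, gradient clipping at norm 5, Adam with learning rate 10^-3, sentence length 30, mini-batch size 8, 5 training epochs) could be wired up in TensorFlow. It is not the authors' released implementation: the TF2/Keras API, the vocabulary size, the embedding and hidden dimensions, and the plain stacked-LSTM language model are assumptions made only to illustrate the hyperparameters; the actual rGBN-RNN couples its stacked RNN with the recurrent gamma belief network described in the paper.

```python
# Hypothetical configuration sketch for the reported hyperparameters.
# Assumptions: TensorFlow 2 / Keras API, vocabulary size, embedding and
# hidden dimensions, and a plain stacked-LSTM language model stand in for
# the full rGBN-RNN architecture.
import tensorflow as tf

VOCAB_SIZE = 10000   # assumption: not reported in this table
SEQ_LEN = 30         # fixed input-sentence length (reported)
BATCH_SIZE = 8       # mini-batch size (reported)
EPOCHS = 5           # number of training epochs (reported)

def build_stacked_rnn_lm(num_layers=3, embed_dim=256, hidden_dim=512):
    """Toy stacked-RNN language model; dropout 0.4 is applied to each layer's input."""
    inputs = tf.keras.Input(shape=(SEQ_LEN,), dtype="int32")
    x = tf.keras.layers.Embedding(VOCAB_SIZE, embed_dim)(inputs)
    for _ in range(num_layers):
        x = tf.keras.layers.Dropout(0.4)(x)                       # reported dropout rate
        x = tf.keras.layers.LSTM(hidden_dim, return_sequences=True)(x)
    logits = tf.keras.layers.Dense(VOCAB_SIZE)(x)
    return tf.keras.Model(inputs, logits)

model = build_stacked_rnn_lm()
# Adam with learning rate 1e-3; clipnorm approximates the reported clipping at norm 5.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=5.0)
model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# model.fit(train_dataset, validation_data=val_dataset,
#           batch_size=BATCH_SIZE, epochs=EPOCHS)
```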