A Discrete Variational Recurrent Topic Model without the Reparametrization Trick

Authors: Mehdi Rezaee, Francis Ferraro

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the performance of our algorithm on the APNEWS, IMDB and BNC datasets that are publicly available. Roughly, there are between 7.7k and 9.8k vocab words in each corpus, with between 15M and 20M training tokens each; Table A1 in the appendix details the statistics of these datasets. We also present experiments demonstrating the performance characteristics of using basic RNN, GRU and LSTM cells in Table 1a. The perplexity values of the baselines and our VRTM across our three heldout evaluation sets are shown in Table 1b.
Researcher Affiliation | Academia | Mehdi Rezaee, Department of Computer Science, University of Maryland Baltimore County, Baltimore, MD 21250, USA, rezaee1@umbc.edu; Francis Ferraro, Department of Computer Science, University of Maryland Baltimore County, Baltimore, MD 21250, USA, ferraro@umbc.edu
Pseudocode | No | The paper provides a generative story (Figure 1b) that lists steps, as well as graphical models (Figure 1a, Figure 2), but it does not contain a formal pseudocode block or algorithm section.
Open Source Code | Yes | Our code, scripts, and models are available at https://github.com/mmrezaee/VRTM.
Open Datasets | Yes | We test the performance of our algorithm on the APNEWS, IMDB and BNC datasets that are publicly available (footnote 3: https://github.com/jhlau/topically-driven-language-model).
Dataset Splits | Yes | These are the same datasets, including the train, validation and test splits, as used by prior work; additional details can be found in [48].
Hardware Specification | No | The paper mentions that "Some experiments were conducted on the UMBC HPCF" but does not provide specific hardware details such as CPU/GPU models, memory, or the number of nodes used.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | For an RNN, we used a single-layer LSTM with 600 units in the hidden layer, set the size of the embedding to 400, and had a fixed/maximum sequence length of 45.
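For reference, below is a minimal sketch of a language-model component matching the reported experiment setup (single-layer LSTM, 600 hidden units, 400-dimensional embeddings, sequences of at most 45 tokens). It assumes a PyTorch implementation; the framework, version, and vocabulary size are not specified in this section, so the vocabulary size is a placeholder based on the reported 7.7k-9.8k range. This is an illustrative guess, not the authors' released implementation (see https://github.com/mmrezaee/VRTM for that).

# Sketch of the RNN component described in the "Experiment Setup" row,
# assuming a PyTorch implementation (framework/version not stated in the paper).
# Hyperparameters follow the reported values; the vocabulary size is a placeholder.
import torch
import torch.nn as nn

VOCAB_SIZE = 9800   # placeholder; corpora have roughly 7.7k-9.8k vocab words
EMBED_DIM = 400     # embedding size reported in the paper
HIDDEN_DIM = 600    # LSTM hidden units reported in the paper
MAX_LEN = 45        # fixed/maximum sequence length reported in the paper

class RNNLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.LSTM(EMBED_DIM, HIDDEN_DIM, num_layers=1, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, MAX_LEN) integer ids
        hidden, _ = self.rnn(self.embed(tokens))
        return self.out(hidden)  # (batch, MAX_LEN, VOCAB_SIZE) next-word logits

model = RNNLanguageModel()
dummy_batch = torch.randint(0, VOCAB_SIZE, (2, MAX_LEN))  # toy batch of 2 sequences
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([2, 45, 9800])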