Bidirectional Recurrent Neural Networks as Generative Models

Authors: Mathias Berglund, Tapani Raiko, Mikko Honkala, Leo Kärkkäinen, Akos Vetek, Juha T. Karhunen

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments on text data show that both proposed methods are much more accurate than unidirectional reconstructions, although a bit less accurate than a computationally complex bidirectional Bayesian inference on the unidirectional RNN. We also provide results on music data for which the Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods." (See the reconstruction sketch after the table.) |
| Researcher Affiliation | Collaboration | Mathias Berglund (Aalto University, Finland); Tapani Raiko (Aalto University, Finland); Mikko Honkala (Nokia Labs, Finland); Leo Kärkkäinen (Nokia Labs, Finland); Akos Vetek (Nokia Labs, Finland); Juha Karhunen (Aalto University, Finland) |
| Pseudocode | No | The paper describes its algorithms and methods in text and mathematical formulas but does not include any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | No | The paper states that the software for the simulations was based on Theano, but it does not provide any link to, or statement about releasing, its own implementation code as open source. |
| Open Datasets | Yes | "We use the same data set as Sutskever et al. [26], which consists of 2GB of English text from Wikipedia. ... In the other set of experiments, we use four polyphonic music data sets [8]." |
| Dataset Splits | Yes | "As the data sets are small, we select the initial learning rate on a grid of {0.0001, 0.0003, ..., 0.3, 1} based on the lowest validation set cost." (See the grid-search sketch after the table.) |
| Hardware Specification | No | The paper mentions using "8 weeks of GPU time" but does not specify the model or type of GPU used for the experiments. |
| Software Dependencies | No | "The software for the simulations for this paper was based on Theano [3, 7]." This names a software framework but does not provide version numbers for Theano or any other key component. |
| Experiment Setup | Yes | "We use c = 1000 hidden units in the unidirectional RNN and c = 684 hidden units in the two hidden layers in the BRNNs. ... We use a minibatch size of 40, i.e. each minibatch consists of 40 randomly sampled sequences of length 250. ... The step size is set to 0.25 for all layers in the beginning of training, and it is linearly decayed to zero during training." (See the training-schedule sketch after the table.) |
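
For context on the Research Type row: the paper's two methods use a bidirectional RNN to reconstruct gaps in sequences, so each missing position is conditioned on context from both sides. As a minimal illustration of why that helps, the hypothetical sketch below scores candidates for a single missing symbol by multiplying a forward-pass distribution with a backward-pass distribution and renormalizing. This product rule is an assumption made for illustration, not the paper's actual inference procedure, and all names are hypothetical.

```python
import numpy as np

def reconstruct_one_step(p_forward, p_backward):
    """Combine per-symbol distributions for one missing position.

    p_forward  -- P(x_t = k | x_1 .. x_{t-1}) from a forward pass
    p_backward -- P(x_t = k | x_{t+1} .. x_T) from a backward pass

    Multiplying and renormalizing uses context from both sides of the
    gap, which a unidirectional model cannot do in a single pass.
    NOTE: this rule is illustrative, not the paper's exact method.
    """
    combined = p_forward * p_backward
    return combined / combined.sum()

# Toy 4-symbol vocabulary: each direction alone is ambiguous between
# two symbols, but together they concentrate on symbol 1.
p_fwd = np.array([0.05, 0.45, 0.45, 0.05])
p_bwd = np.array([0.45, 0.45, 0.05, 0.05])
print(reconstruct_one_step(p_fwd, p_bwd))  # ~[0.09, 0.81, 0.09, 0.01]
```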
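
The Dataset Splits row documents hyperparameter selection on a held-out validation set. A minimal sketch of that selection loop follows; `train_and_validate` is a hypothetical callable, and since the quoted grid elides its middle values, the expanded grid below is an assumption (the visible endpoints suggest alternating 1x/3x steps per decade).

```python
# Hypothetical expansion of the quoted grid {0.0001, 0.0003, ..., 0.3, 1}.
# The elided middle values are not given in the paper excerpt, so this
# list is an assumption, not a confirmed detail.
ASSUMED_GRID = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 0.1, 0.3, 1.0]

def select_learning_rate(train_and_validate, grid=ASSUMED_GRID):
    """Pick the initial learning rate with the lowest validation cost.

    train_and_validate -- hypothetical callable mapping a learning rate
    to the validation-set cost after training with that rate.
    """
    best_rate, best_cost = None, float("inf")
    for lr in grid:
        cost = train_and_validate(lr)
        if cost < best_cost:
            best_rate, best_cost = lr, cost
    return best_rate, best_cost
```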
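
The Experiment Setup row pins down the minibatch construction (40 randomly sampled sequences of length 250) and the step-size schedule (0.25 initially, linearly decayed to zero). The sketch below implements both under two assumptions the quote does not state: the corpus is a single 1-D token array, and subsequence start positions are drawn uniformly.

```python
import numpy as np

SEQ_LEN, BATCH_SIZE, INITIAL_STEP = 250, 40, 0.25

def sample_minibatch(corpus, rng):
    """Draw 40 random length-250 subsequences from a 1-D token array.

    Uniform start positions are an assumption; the paper only says the
    sequences are "randomly sampled".
    """
    starts = rng.integers(0, len(corpus) - SEQ_LEN, size=BATCH_SIZE)
    return np.stack([corpus[s:s + SEQ_LEN] for s in starts])

def step_size(update, total_updates):
    """Linear decay from 0.25 at the first update to zero at the last."""
    return INITIAL_STEP * (1.0 - update / total_updates)

# Example: one minibatch shape and the schedule endpoints.
rng = np.random.default_rng(0)
corpus = rng.integers(0, 96, size=100_000)        # toy character stream
print(sample_minibatch(corpus, rng).shape)        # (40, 250)
print(step_size(0, 1000), step_size(1000, 1000))  # 0.25 0.0
```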