Bidirectional Recurrent Neural Networks as Generative Models
Authors: Mathias Berglund, Tapani Raiko, Mikko Honkala, Leo Kärkkäinen, Akos Vetek, Juha T. Karhunen
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on text data show that both proposed methods are much more accurate than unidirectional reconstructions, although a bit less accurate than a computationally complex bidirectional Bayesian inference on the unidirectional RNN. We also provide results on music data for which the Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods. |
| Researcher Affiliation | Collaboration | Mathias Berglund, Aalto University, Finland; Tapani Raiko, Aalto University, Finland; Mikko Honkala, Nokia Labs, Finland; Leo Kärkkäinen, Nokia Labs, Finland; Akos Vetek, Nokia Labs, Finland; Juha Karhunen, Aalto University, Finland |
| Pseudocode | No | The paper describes algorithms and methods in text and mathematical formulas but does not include any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | No | The paper states that the software for simulations was based on Theano, but it does not provide any specific link or statement about releasing its own implementation code as open source. |
| Open Datasets | Yes | We use the same data set as Sutskever et al. [26], which consists of 2GB of English text from Wikipedia. ... In the other set of experiments, we use four polyphonic music data sets [8]. |
| Dataset Splits | Yes | As the data sets are small, we select the initial learning rate on a grid of {0.0001, 0.0003, . . . , 0.3, 1} based on the lowest validation set cost. |
| Hardware Specification | No | The paper mentions using '8 weeks of GPU time' but does not specify the model or type of GPU used for the experiments. |
| Software Dependencies | No | The software for the simulations for this paper was based on Theano [3, 7]. The paper names a software framework but does not provide version numbers for Theano or any other key software component. |
| Experiment Setup | Yes | We use c = 1000 hidden units in the unidirectional RNN and c = 684 hidden units in the two hidden layers in the BRNNs. ... We use a minibatch size of 40, i.e. each minibatch consists of 40 randomly sampled sequences of length 250. ... The step size is set to 0.25 for all layers in the beginning of training, and it is linearly decayed to zero during training. |
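The Experiment Setup row quotes two concrete training details: minibatches of 40 randomly sampled sequences of length 250, and a step size that starts at 0.25 and is linearly decayed to zero over training. A minimal sketch of those two pieces, assuming a NumPy environment (the function names `lr_schedule` and `sample_minibatch`, and the integer stand-in corpus, are our own illustration, not the authors' code):

```python
import numpy as np

def lr_schedule(step, total_updates, initial_lr=0.25):
    """Linearly decay the step size from initial_lr to zero, as quoted."""
    return initial_lr * max(0.0, 1.0 - step / total_updates)

def sample_minibatch(corpus, batch_size=40, seq_len=250, rng=None):
    """Draw batch_size random subsequences of length seq_len from the corpus."""
    rng = rng or np.random.default_rng(0)
    starts = rng.integers(0, len(corpus) - seq_len, size=batch_size)
    return np.stack([corpus[s:s + seq_len] for s in starts])

# Stand-in for the encoded text corpus; the real data is 2 GB of Wikipedia text.
corpus = np.arange(100_000)
batch = sample_minibatch(corpus)
print(batch.shape)                       # (40, 250)
print(lr_schedule(500, 1000))            # halfway through training: 0.125
```

This mirrors only the schedule and sampling described in the table; the model itself (a unidirectional RNN with 1000 hidden units, or a BRNN with 684 units per direction) is not reproduced here.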