Deep AutoRegressive Networks
Authors: Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate state-of-the-art generative performance on a number of classic data sets, including several UCI data sets, MNIST and Atari 2600 games. We trained our models on binary UCI data sets, MNIST digits and frames from five Atari 2600 games (Bellemare et al., 2013). |
| Researcher Affiliation | Industry | Karol Gregor KAROLG@GOOGLE.COM Ivo Danihelka DANIHELKA@GOOGLE.COM Andriy Mnih AMNIH@GOOGLE.COM Charles Blundell CBLUNDELL@GOOGLE.COM Daan Wierstra WIERSTRA@GOOGLE.COM Google DeepMind |
| Pseudocode | No | The paper describes the learning steps in a numbered list within paragraph text, but does not provide a formally labeled 'Pseudocode' or 'Algorithm' block: 'Learning proceeds as follows: 1. Given an observation x, sample a representation h ∼ q(H|x) (see Section 2.3). 2. Calculate q(h|x) (Eq. 4), p(x|h) (Eq. 3) and p(h) (Eq. 1) for the sampled representation h and given observation x. 3. Calculate the gradient of Eq. 15. 4. Update the parameters of the autoencoder by following the gradient ∇θL(x).' |
| Open Source Code | No | The paper states: 'The script to generate the dataset is available at https://github.com/fidlej/aledataset'. This link is for a dataset generation script, not the source code for the DARN methodology itself, and no explicit statement regarding the release of the DARN model's source code is made. |
| Open Datasets | Yes | We trained our models on binary UCI data sets, MNIST digits and frames from five Atari 2600 games (Bellemare et al., 2013). We evaluated the test-set performance of DARN on eight binary data sets from the UCI repository (Bache & Lichman, 2013). We evaluated the sampling and test-set performance of DARN on the binarised MNIST data set (Salakhutdinov & Murray, 2008), which consists of 50,000 training, 10,000 validation, and 10,000 testing images of hand-written digits (Larochelle & Murray, 2011). The script to generate the dataset is available at https://github.com/fidlej/aledataset |
| Dataset Splits | Yes | on the binarised MNIST data set (Salakhutdinov & Murray, 2008), which consists of 50,000 training, 10,000 validation, and 10,000 testing images of hand-written digits (Larochelle & Murray, 2011). Where available, we used a validation set to choose the learning rate and certain aspects of the model architecture, such as the number of hidden units. The architecture and learning rate was selected by cross-validation on a validation set for each data set. |
| Hardware Specification | No | The paper mentions that the system is 'easily implementable on graphical processing units', but does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for experiments. |
| Software Dependencies | No | The paper mentions 'RMSprop (Graves, 2013)' and 'Adaptive weight noise (Graves, 2011)', which are algorithms or techniques, not specific software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). No programming languages or specific software dependencies with version numbers are stated. |
| Experiment Setup | Yes | The number of deterministic hidden units was selected from 100 to 500, in steps of 100, whilst the number of stochastic hidden units was selected from {8, 12, 16, 32, 64, 128, 256}. We used RMSprop (Graves, 2013) with momentum 0.9 and learning rates 0.00025, 0.0000675 or 10^-5. The network was trained with minibatches of size 100. We used two hidden layers, one deterministic, one stochastic. The deterministic layer had 100 units for architectures with 16 or fewer stochastic units per layer, and 500 units for more than 16 stochastic units. The deterministic activation function was taken to be the tanh function. Training was done with RMSprop (Graves, 2013), momentum 0.9 and minibatches of size 100. We used a learning rate of 3 × 10^-5. |
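The four learning steps quoted in the Pseudocode row can be sketched in code. This is a minimal illustration, not the paper's implementation: it assumes fully factorized Bernoulli encoder, decoder, and prior (DARN's conditionals are autoregressive), and all weight shapes and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny DARN-like model with factorized Bernoulli distributions
# (the paper's q(h|x), p(x|h) and p(h) are autoregressive; this flattens them).
n_x, n_h = 6, 3
W_enc = rng.normal(0, 0.1, (n_h, n_x))   # encoder weights for q(h|x)
W_dec = rng.normal(0, 0.1, (n_x, n_h))   # decoder weights for p(x|h)
b_prior = np.zeros(n_h)                  # prior logits for p(h)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli_logprob(bits, probs):
    # log probability of a binary vector under independent Bernoullis
    return float(np.sum(bits * np.log(probs) + (1 - bits) * np.log(1 - probs)))

x = rng.integers(0, 2, n_x).astype(float)  # one binary observation

# Step 1: given x, sample a representation h ~ q(H|x)
q_probs = sigmoid(W_enc @ x)
h = (rng.random(n_h) < q_probs).astype(float)

# Step 2: calculate q(h|x), p(x|h) and p(h) at the sampled h
log_q = bernoulli_logprob(h, q_probs)
log_px = bernoulli_logprob(x, sigmoid(W_dec @ h))
log_ph = bernoulli_logprob(h, sigmoid(b_prior))

# Steps 3-4 (gradient and parameter update, Eq. 15 in the paper) are omitted;
# this is the single-sample MDL-style loss those steps would differentiate.
loss_sample = log_q - (log_px + log_ph)
```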
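The Dataset Splits row quotes a fixed 50,000/10,000/10,000 partition of binarised MNIST. A sketch of that partition on index arrays (the real protocol uses the predefined split from Larochelle & Murray, 2011, not an arbitrary slicing):

```python
import numpy as np

# Illustrative 50,000 train / 10,000 validation / 10,000 test split,
# matching the counts quoted from the paper.
n_total = 70_000
indices = np.arange(n_total)
train, valid, test = np.split(indices, [50_000, 60_000])
```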
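The Experiment Setup row reports RMSprop with momentum 0.9. As a point of clarification, a common form of that update is sketched below; Graves (2013) describes a variant that additionally tracks a running mean of the gradient, and the decay and epsilon values here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rmsprop_momentum_step(theta, grad, ms, mom,
                          lr=2.5e-4, momentum=0.9, decay=0.95, eps=1e-8):
    """One RMSprop-with-momentum step (common form, not Graves's exact variant).

    ms  - running mean of squared gradients
    mom - momentum buffer
    """
    ms = decay * ms + (1 - decay) * grad ** 2
    mom = momentum * mom - lr * grad / np.sqrt(ms + eps)
    return theta + mom, ms, mom

# One step on the gradient of f(theta) = 0.5 * ||theta - 1||^2 at theta = 0,
# i.e. grad = -(1 - theta) = [-1, -1]; the update should move theta toward 1.
theta = np.zeros(2)
ms = np.zeros(2)
mom = np.zeros(2)
theta, ms, mom = rmsprop_momentum_step(theta, np.array([-1.0, -1.0]), ms, mom)
```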