Iterative Neural Autoregressive Distribution Estimator NADE-k

Authors: Tapani Raiko, Yao Li, Kyunghyun Cho, Yoshua Bengio

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study the proposed model with two datasets: binarized MNIST handwritten digits and Caltech 101 silhouettes. We report in Table 1 the mean of the test log-probabilities averaged over randomly selected orderings.
Researcher Affiliation | Academia | Tapani Raiko, Aalto University; Li Yao, Université de Montréal; Kyunghyun Cho, Université de Montréal; Yoshua Bengio, Université de Montréal, CIFAR Senior Fellow
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We have made our implementation available git@github.com:yaoli/nade_k.git
Open Datasets | Yes | We study the proposed model with two datasets: binarized MNIST handwritten digits and Caltech 101 silhouettes. ... We closely followed the procedure used by Uria et al. (2014), including the split of the dataset... We also evaluate the proposed NADE-k on Caltech-101 Silhouettes (Marlin et al., 2010), using the standard split...
Dataset Splits | Yes | We closely followed the procedure used by Uria et al. (2014), including the split of the dataset into 50,000 training samples, 10,000 validation samples and 10,000 test samples. ... using the standard split of 4100 training samples, 2264 validation samples and 2307 test samples. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using AdaDelta and Theano but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We use stochastic gradient descent on the training set with a minibatch size fixed to 100. ... We used a fixed width of 500 units per hidden layer. The number of steps k was selected among {1, 2, 4, 5, 7}. ... Each model was pretrained for 1000 epochs and fine-tuned for 1000 epochs in the case of one hidden layer and 2000 epochs in the case of two. ... The regularization constant was chosen to be 0.00122 for the two-hidden-layer model. (See the hyperparameter sketch after the table.)
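
The split sizes quoted in the Dataset Splits row map directly onto a small configuration. The sketch below is only illustrative: it assumes the examples are already loaded as a NumPy array in the canonical order of the standard splits, and the function and variable names are hypothetical rather than taken from the authors' code.

```python
import numpy as np

# Split sizes quoted from the paper: Uria et al. (2014) protocol for binarized MNIST,
# standard split for Caltech-101 Silhouettes.
SPLITS = {
    "binarized_mnist": {"train": 50_000, "valid": 10_000, "test": 10_000},
    "caltech101_silhouettes": {"train": 4_100, "valid": 2_264, "test": 2_307},
}

def split_dataset(x, sizes):
    """Cut an (N, D) array into train/valid/test blocks of the quoted sizes.

    Assumes rows are already in the canonical order of the standard split;
    this is an illustrative sketch, not the authors' loading code.
    """
    n_train, n_valid = sizes["train"], sizes["valid"]
    assert x.shape[0] == n_train + n_valid + sizes["test"]
    return x[:n_train], x[n_train:n_train + n_valid], x[n_train + n_valid:]

# Example with random placeholder data shaped like binarized MNIST (784 pixels).
x = (np.random.rand(70_000, 784) > 0.5).astype(np.float32)
train, valid, test = split_dataset(x, SPLITS["binarized_mnist"])
```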
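
The Experiment Setup row reads as a compact hyperparameter list. Below is a minimal sketch, assuming plain minibatch SGD with an L2 penalty as a rough stand-in for the paper's AdaDelta/Theano training; `grad_fn`, `params`, and the learning rate are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

# Hyperparameters quoted in the Experiment Setup row; names here are illustrative.
CONFIG = {
    "minibatch_size": 100,
    "hidden_width": 500,              # units per hidden layer
    "k_candidates": [1, 2, 4, 5, 7],  # number of NADE-k steps searched over
    "pretrain_epochs": 1000,
    "finetune_epochs": {"one_hidden_layer": 1000, "two_hidden_layers": 2000},
    "l2_regularization": 0.00122,     # reported for the two-hidden-layer model
}

def sgd_epoch(params, data, grad_fn, lr=0.001, batch_size=CONFIG["minibatch_size"]):
    """One epoch of plain minibatch SGD over shuffled training data.

    `grad_fn(params, batch)` is assumed to return gradients with the same
    structure as `params`; the paper itself uses AdaDelta-style updates.
    """
    order = np.random.permutation(len(data))
    for start in range(0, len(data), batch_size):
        batch = data[order[start:start + batch_size]]
        grads = grad_fn(params, batch)
        for name in params:
            # L2 weight decay folded into the update as a simple stand-in.
            params[name] -= lr * (grads[name] + CONFIG["l2_regularization"] * params[name])
    return params
```

Choosing k from {1, 2, 4, 5, 7} and applying the pretraining and fine-tuning epoch counts would sit outside this inner loop, as part of the model-selection procedure described in the quote.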