Techniques for Learning Binary Stochastic Feedforward Neural Networks
Authors: Tapani Raiko, Mathias Berglund, Guillaume Alain, and Laurent Dinh
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments confirm that training stochastic networks is difficult and show that the proposed two estimators perform favorably among all the five known estimators. We propose two experiments as benchmarks for stochastic feedforward networks based on the MNIST handwritten digit dataset (LeCun et al., 1998) and the Toronto Face Database (Susskind et al., 2010). |
| Researcher Affiliation | Academia | Tapani Raiko & Mathias Berglund, Department of Information and Computer Science, Aalto University, Espoo, Finland, {tapani.raiko,mathias.berglund}@aalto.fi; Guillaume Alain & Laurent Dinh, Department of Computer Science and Operations Research, Université de Montréal, Montréal, Canada, guillaume.alain.umontreal@gmail.com, dinhlaur@iro.umontreal.ca |
| Pseudocode | No | The paper describes estimators and mathematical formulations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for its methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We propose two experiments as benchmarks for stochastic feedforward networks based on the MNIST handwritten digit dataset (LeCun et al., 1998) and the Toronto Face Database (Susskind et al., 2010). |
| Dataset Splits | Yes | In the MNIST experiments we used a separate validation set to select the learning rate. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper acknowledges the use of Theano ('Theano (Bastien et al., 2012; Bergstra et al., 2010)') but does not specify its version number or any other software dependencies with their versions, which are required for reproducibility. |
| Experiment Setup | Yes | In all of the experiments, we used stochastic gradient descent with a mini-batch size of 100 and momentum of 0.9. We used a learning rate schedule where the learning rate increases linearly from zero to maximum during the first five epochs and back to zero during the remaining epochs. The maximum learning rate was chosen among {0.0001, 0.0003, 0.001, ..., 1}. The models were trained with M ∈ {1, 20}, and during test time we always used M = 100. We used a network structure of 392-200-200-392 and 2304-200-200-2304 in the first and second problem, respectively. |
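
The Experiment Setup row fixes the optimizer (SGD, mini-batch size 100, momentum 0.9), the triangular learning-rate schedule, and the number of stochastic samples M used at train and test time. The sketch below is a plain-NumPy reading of those quoted settings, not the authors' Theano implementation; names such as `n_epochs`, `forward_sample`, and the function signatures are illustrative assumptions, and the elided maximum-learning-rate grid is left as quoted above.

```python
import numpy as np

def learning_rate(epoch, max_lr, n_epochs, warmup_epochs=5):
    """Triangular schedule from the quoted setup: linear ramp from 0 to
    max_lr over the first `warmup_epochs` epochs, then linear decay back
    to 0 over the remaining epochs. `n_epochs` is an assumed total."""
    if epoch < warmup_epochs:
        return max_lr * (epoch + 1) / warmup_epochs
    remaining = n_epochs - warmup_epochs
    return max_lr * max(0.0, (n_epochs - 1 - epoch) / remaining)

def sgd_momentum_step(params, grads, velocities, lr, momentum=0.9):
    """One mini-batch update (mini-batch size 100 in the paper) with
    classical momentum, applied in place to NumPy parameter arrays."""
    for p, g, v in zip(params, grads, velocities):
        v *= momentum
        v -= lr * g
        p += v

def predict(x, forward_sample, M=100):
    """Average M stochastic forward passes of the network; the paper
    uses M = 100 at test time. `forward_sample` is a hypothetical
    callable returning one sampled output for input `x`."""
    return np.mean([forward_sample(x) for _ in range(M)], axis=0)
```

Under this reading, the maximum learning rate would be chosen on the separate validation set noted in the Dataset Splits row, and `learning_rate` would then be evaluated once per epoch with that chosen value.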