General Stochastic Networks for Classification
Authors: Matthias Zöhrer, Franz Pernkopf
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 3 we empirically study the influence of hyperparameters of GSNs and present experimental results. |
| Researcher Affiliation | Academia | Matthias Zöhrer and Franz Pernkopf, Signal Processing and Speech Communication Laboratory, Graz University of Technology, matthias.zoehrer@tugraz.at, pernkopf@tugraz.at |
| Pseudocode | No | The paper includes diagrams of Markov chains and mathematical equations but no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The code will be made publicly available for reproducing the results. |
| Open Datasets | Yes | In order to evaluate the capabilities of GSNs for supervised learning, we studied MNIST digits [13], variants of MNIST digits [14] and the rectangle datasets [14]. |
| Dataset Splits | Yes | Each variant includes 10,000 labeled training, 2,000 labeled validation, and 50,000 labeled test images. |
| Hardware Specification | No | All simulations were executed on a GPU with the help of the mathematical expression compiler Theano [31]. The specific model or type of GPU is not mentioned. |
| Software Dependencies | No | All simulations were executed on a GPU with the help of the mathematical expression compiler Theano [31]. No specific version number for Theano is provided. |
| Experiment Setup | Yes | In all experiments a three-layer GSN, i.e. GSN-3, with 2000 neurons in each layer, randomly initialized with small Gaussian noise, i.e. 0.01·N(0, 1), and an MSE loss function for both inputs and targets was used. Regarding optimization we applied SGD with a learning rate η = 0.1, a momentum term of 0.9, and a multiplicative annealing factor η_{n+1} = 0.99·η_n per epoch n for the learning rate. A rectifier unit [23] was chosen as activation function. Walkback training with K = 6 steps, using zero-mean pre- and post-activation Gaussian noise with variance σ = 0.1, was performed for 500 training epochs. |
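Two of the setup details above can be sketched concretely: the multiplicative learning-rate annealing (η_{n+1} = 0.99·η_n, starting from η = 0.1) and the zero-mean Gaussian noise injected before and after the rectifier activation during walkback training. The sketch below is a minimal illustration of those two pieces only, not the authors' code (which was not released); all function names are our own.

```python
import numpy as np

def annealed_lr(eta0=0.1, anneal=0.99, epochs=500):
    """Learning-rate schedule from the paper's setup: eta_{n+1} = 0.99 * eta_n."""
    return [eta0 * anneal ** n for n in range(epochs)]

def noisy_relu(pre_activation, sigma=0.1, rng=None):
    """Rectifier unit with zero-mean Gaussian noise (variance sigma) added
    both pre- and post-activation, as used in walkback training."""
    rng = rng or np.random.default_rng(0)
    noisy_pre = pre_activation + rng.normal(0.0, sigma, pre_activation.shape)
    post = np.maximum(noisy_pre, 0.0)  # rectifier
    return post + rng.normal(0.0, sigma, post.shape)

lrs = annealed_lr()
# First two epochs: 0.1, then 0.1 * 0.99 = 0.099
```

Note that with this schedule the learning rate decays to roughly 0.1·0.99^499 ≈ 6.6e-4 by the final epoch, so late training takes very small steps.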