Inherent Weight Normalization in Stochastic Neural Networks

Authors: Georgios Detorakis, Sourav Dutta, Abhishek Khanna, Matthew Jerry, Suman Datta, Emre Neftci

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and event-based classification benchmarks (N-MNIST and DVS Gestures). Our results show that NSMs perform comparably or better than conventional artificial neural networks with the same architecture.
Researcher Affiliation | Academia | Georgios Detorakis, Department of Cognitive Sciences, University of California Irvine, Irvine, CA 92697, gdetorak@uci.edu; Sourav Dutta, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA, sdutta4@nd.edu; Abhishek Khanna, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA, akhanna@nd.edu; Matthew Jerry, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA, mjerry@alumni.nd.edu; Suman Datta, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA, sdatta@nd.edu; Emre Neftci, Department of Cognitive Sciences and Department of Computer Science, University of California Irvine, Irvine, CA 92697, eneftci@uci.edu
Pseudocode | No | The paper describes equations and procedures but does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | https://github.com/nmi-lab/neural_sampling_machines (footnote on page 5)
Open Datasets | Yes | We demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and event-based classification benchmarks (N-MNIST and DVS Gestures). Experiments use the MNIST [26], EMNIST [11], N-MNIST [39], and DVS Gestures data sets (See Methods) with a convolutional architecture.
Dataset Splits | No | The paper gives standard train/test splits for specific datasets (e.g., '50K/10K images for training and testing respectively' for CIFAR10/100, and '23 subjects are used for the training set, and the remaining 6 subjects are reserved for testing' for DVS Gestures) but does not describe a separate validation split, its size, or how it is used for hyperparameter tuning. (A torchvision sketch of the standard CIFAR-10 split follows the table.)
Hardware Specification | No | The paper mentions 'GPU simulations' in the contributions section but does not provide specific hardware details such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | 'All simulations were performed using Pytorch [40].' No version numbers are given for PyTorch or any other software.
Experiment Setup | Yes | The NSM was trained using back-propagation and a softmax layer with cross-entropy loss and minibatches of size 100. We used the Adam [24] optimizer with an initial learning rate of 0.0003 and trained for 200 epochs using a batch size of 100 over the entire CIFAR10/100 data sets. After 100 epochs we started decaying the learning rate linearly and changed the first moment from 0.9 to 0.5. (A PyTorch sketch of this schedule follows the table.)
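
For reference on the Dataset Splits row, here is a minimal sketch of the standard CIFAR-10 train/test partition the paper relies on. Using torchvision is an assumption on our part; the paper only states that simulations used PyTorch.

```python
import torch
from torchvision import datasets, transforms

# Standard CIFAR-10 partition: 50K training / 10K test images, matching the
# "50K/10K images for training and testing" quoted in the Dataset Splits row.
# The paper does not describe carving a validation set out of the 50K images.
transform = transforms.ToTensor()
train_set = datasets.CIFAR10(root="data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=transform)

# Minibatches of size 100, as stated in the Experiment Setup row.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=100, shuffle=False)
```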
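
And a minimal PyTorch sketch of the optimization schedule reported in the Experiment Setup row. The `nn.Sequential` model is a hypothetical stand-in; the actual NSM layers live in the repository linked in the Open Source Code row.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; the real NSM architecture is in
# https://github.com/nmi-lab/neural_sampling_machines.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

criterion = nn.CrossEntropyLoss()  # softmax layer + cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.0003, betas=(0.9, 0.999))

num_epochs = 200
for epoch in range(num_epochs):
    # After 100 epochs, decay the learning rate linearly and change the
    # first moment (beta1) from 0.9 to 0.5, as reported in the paper.
    if epoch >= 100:
        frac = (epoch - 100) / (num_epochs - 100)
        for group in optimizer.param_groups:
            group["lr"] = 0.0003 * (1.0 - frac)
            group["betas"] = (0.5, 0.999)
    # Training loop over minibatches of size 100 would go here:
    # for x, y in train_loader:
    #     optimizer.zero_grad()
    #     loss = criterion(model(x), y)
    #     loss.backward()
    #     optimizer.step()
```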