Generative Modeling of Convolutional Neural Networks

Authors: Jifeng Dai, Yang Lu, and Ying Nian Wu

ICLR 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | 4 EXPERIMENTS; Table 1: Error rates on the MNIST test set of different training approaches utilizing the LeNet network (LeCun et al., 1998); Table 2: Top-1 classification error rates on the ImageNet ILSVRC-2012 val set of different training approaches. |
| Researcher Affiliation | Collaboration | Jifeng Dai, Microsoft Research, jifdai@microsoft.com; Yang Lu and Ying Nian Wu, University of California, Los Angeles, {yanglv, ywu}@stat.ucla.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code can be downloaded at http://www.stat.ucla.edu/~yang.lu/Project/generativeCNN/main.html |
| Open Datasets | Yes | Experiments are performed on two commonly used image classification benchmarks: MNIST (LeCun et al., 1998) handwritten digit recognition and ImageNet ILSVRC-2012 (Deng et al., 2009) natural image classification. |
| Dataset Splits | Yes | Network training and testing are performed on the train and val sets, respectively. |
| Hardware Specification | Yes | On a desktop with a GTX Titan, it takes about 0.4 minutes to draw a sample for nodes at the final fully-connected layer of LeNet. |
| Software Dependencies | No | The algorithms are built on the code of Caffe (Jia et al., 2014), but no version number for Caffe or other software dependencies is specified. |
| Experiment Setup | Yes | For all three training approaches, stochastic gradient descent is performed with a batch size of 64, a base learning rate of 0.01, a weight decay of 0.0005, a momentum of 0.9, and a maximum of 25 epochs. For GG+DG, the pre-training stage stops after 16 epochs and the discriminative gradient tuning stage starts with a base learning rate of 0.003. |
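The reported hyperparameters (learning rate 0.01, weight decay 0.0005, momentum 0.9) match the standard Caffe-style SGD update, where L2 weight decay is folded into the gradient before the momentum step. A minimal NumPy sketch of one such parameter update follows; the function and variable names are hypothetical illustrations, not code from the paper's release:

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup.
BASE_LR = 0.01        # base learning rate
WEIGHT_DECAY = 0.0005 # L2 weight decay coefficient
MOMENTUM = 0.9        # momentum coefficient

def sgd_step(w, grad, velocity, lr=BASE_LR):
    """One Caffe-style SGD update with momentum and L2 weight decay.

    Weight decay is added to the gradient first, then momentum is applied:
        v <- momentum * v - lr * (grad + weight_decay * w)
        w <- w + v
    """
    velocity = MOMENTUM * velocity - lr * (grad + WEIGHT_DECAY * w)
    return w + velocity, velocity

# Toy usage: a single parameter updated once from a fixed gradient.
w = np.array([1.0])
v = np.zeros_like(w)
w, v = sgd_step(w, grad=np.array([0.5]), velocity=v)
```

For GG+DG, the same update rule would simply be restarted after pre-training with `lr=0.003`, as stated in the setup row above.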