Generative Modeling of Convolutional Neural Networks
Authors: Jifeng Dai, Yang Lu, and Ying Nian Wu
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments); Table 1: Error rates on the MNIST test set of different training approaches utilizing the LeNet network (LeCun et al., 1998); Table 2: Top-1 classification error rates on the ImageNet ILSVRC-2012 val set of different training approaches. |
| Researcher Affiliation | Collaboration | Jifeng Dai Microsoft Research jifdai@microsoft.com, Yang Lu and Ying Nian Wu University of California, Los Angeles {yanglv, ywu}@stat.ucla.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code can be downloaded at http://www.stat.ucla.edu/~yang.lu/Project/generativeCNN/main.html |
| Open Datasets | Yes | Experiments are performed on two commonly used image classification benchmarks: MNIST (LeCun et al., 1998) handwritten digit recognition and ImageNet ILSVRC-2012 (Deng et al., 2009) natural image classification. |
| Dataset Splits | Yes | Network training and testing are performed on the train and val sets respectively. |
| Hardware Specification | Yes | On a desktop with a GTX Titan, it takes about 0.4 minutes to draw a sample for nodes at the final fully-connected layer of LeNet. |
| Software Dependencies | No | The paper states "We build algorithms on the code of Caffe (Jia et al., 2014)", but no version number for Caffe or any other software dependency is specified. |
| Experiment Setup | Yes | For all three training approaches, stochastic gradient descent is performed in training with a batch size of 64, a base learning rate of 0.01, a weight decay term of 0.0005, a momentum term of 0.9, and a max epoch number of 25. For GG+DG, the pre-training stage stops after 16 epochs and the discriminative gradient tuning stage starts with a base learning rate of 0.003 (see the sketch below). |
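
The reported training hyperparameters map onto a standard SGD configuration. Below is a minimal, hypothetical Python sketch of that setup using PyTorch for illustration only; the paper's experiments actually use Caffe, and the `model` here is a placeholder rather than the LeNet or ImageNet networks used in the paper.

```python
# Hypothetical sketch of the reported training schedule (PyTorch notation);
# the paper's actual implementation builds on Caffe, not PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder stand-in for LeNet

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,              # base learning rate reported in the paper
    momentum=0.9,         # momentum term
    weight_decay=0.0005,  # weight decay term
)
criterion = nn.CrossEntropyLoss()

BATCH_SIZE = 64        # reported batch size
MAX_EPOCHS = 25        # reported max epoch number
PRETRAIN_EPOCHS = 16   # GG+DG: pre-training stage length

for epoch in range(MAX_EPOCHS):
    # GG+DG: after pre-training, switch to discriminative gradient tuning
    # with the lower base learning rate of 0.003.
    if epoch == PRETRAIN_EPOCHS:
        for group in optimizer.param_groups:
            group["lr"] = 0.003
    # ... iterate over mini-batches of size BATCH_SIZE, then call
    # optimizer.zero_grad(); loss.backward(); optimizer.step()
```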