Neural Networks with Recurrent Generative Feedback

Authors: Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Tsao, Anima Anandkumar

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks." (Section 3.1, "Generative feedback promotes robustness") |
| Researcher Affiliation | Collaboration | Yujia Huang¹, James Gornet¹, Sihui Dai¹, Zhiding Yu², Tan Nguyen³, Doris Y. Tsao¹, Anima Anandkumar¹,² (¹California Institute of Technology, ²NVIDIA, ³Rice University) |
| Pseudocode | Yes | "Algorithm 1: Iterative inference and online update in CNN-F" (a hedged sketch of the inference loop appears after the table) |
| Open Source Code | No | The paper does not provide an explicit statement of, or a link to, open-source code for its methodology. |
| Open Datasets | Yes | "We train a CNN-F model with two convolution layers and one fully-connected layer on clean Fashion-MNIST images. ... We train the CNN-F on Fashion-MNIST and CIFAR-10 datasets respectively." |
| Dataset Splits | No | The paper mentions training and testing on Fashion-MNIST and CIFAR-10 but gives no specific split information (e.g., percentages or sample counts) for the training, validation, and test sets. It mentions using "a validation image from Fashion-MNIST" but not a systematic split. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU or CPU model) used to run the experiments. |
| Software Dependencies | No | The paper mentions training on Fashion-MNIST and CIFAR-10 and using the Adam optimizer [13] in the Appendix, but does not list software dependencies with version numbers (e.g., PyTorch or TensorFlow versions). |
| Experiment Setup | Yes | "For Fashion-MNIST, we train a network with 4 convolution layers and 3 fully-connected layers. We use 2 convolutional layers to encode the image into feature space and reconstruct to that feature space. For CIFAR-10, we use the Wide ResNet architecture [39] with depth 40 and width 2. We reconstruct to the feature space after 5 basic blocks in the first network block. For more detailed hyper-parameter settings, please refer to Appendix B.2. During training, we use PGD-7 to attack the first forward pass of CNN-F to obtain adversarial samples." (a hedged PGD-7 sketch appears below) |