Understanding the Effective Receptive Field in Deep Convolutional Neural Networks

Authors: Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we empirically study the ERF for various deep CNN architectures. We first use artificially constructed CNN models to verify the theoretical results in our analysis. We then present our observations on how the ERF changes during the training of deep CNNs on real datasets."
Researcher Affiliation | Academia | Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel; Department of Computer Science, University of Toronto; {wenjie, yujiali, urtasun, zemel}@cs.toronto.edu
Pseudocode | No | The paper contains no explicitly labeled 'Pseudocode' or 'Algorithm' section and no structured code blocks.
Open Source Code | No | The paper makes no statement about releasing its source code and provides no link to a code repository.
Open Datasets | Yes | "For the classification task we trained a ResNet with 17 residual blocks on the CIFAR-10 dataset. ... For the semantic segmentation task we used the CamVid dataset for urban scene segmentation."
Dataset Splits | No | "For the classification task we trained a ResNet with 17 residual blocks on the CIFAR-10 dataset. At the end of training this network reached a test accuracy of 89%. For the semantic segmentation task we used the CamVid dataset for urban scene segmentation. We trained a front-end model [21] which is a purely convolutional network..." The paper mentions training and testing on these datasets but gives no details of the train/validation/test splits (e.g., percentages, sample counts, or explicit references to standard splits).
Hardware Specification | No | The paper provides no details of the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions the use of 'standard neural network tools' but names no software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | No | "For all ERF studies, we place a gradient signal of 1 at the center of the output plane and 0 everywhere else, and then back-propagate this gradient through the network to get input gradients. ... For the classification task we trained a ResNet with 17 residual blocks on the CIFAR-10 dataset. ... We propose a new random weight initialization scheme that makes the weights at the center of the convolution kernel have a smaller scale, and the weights on the outside be larger..." The paper describes some architectural choices and a custom initialization scheme but omits hyperparameters such as the learning rate, batch size, optimizer, and number of training epochs, which are needed for full reproducibility. (Hedged sketches of the ERF measurement and the initialization scheme follow the table.)
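
The gradient-based ERF measurement quoted in the Experiment Setup row is mechanically simple to reproduce. Below is a minimal sketch in PyTorch; the paper predates PyTorch and releases no code, so the framework choice, the toy model, the input size, and the function name are all illustrative assumptions rather than the authors' implementation:

```python
# Minimal sketch of the ERF measurement described in the paper:
# a gradient of 1 at the center of the output plane, 0 everywhere else,
# back-propagated to the input to obtain the input-gradient map.
import torch
import torch.nn as nn

def effective_receptive_field(model, input_shape=(1, 3, 64, 64)):
    """Return the absolute input-gradient map for a unit gradient placed
    at the center pixel of the output plane (channel 0 shown here)."""
    # A single random input is used for simplicity; for networks with
    # nonlinearities one would likely average this map over many inputs.
    x = torch.randn(input_shape, requires_grad=True)
    y = model(x)                                   # output map, shape (N, C, H, W)
    grad = torch.zeros_like(y)
    grad[0, 0, y.shape[2] // 2, y.shape[3] // 2] = 1.0   # 1 at center, 0 elsewhere
    y.backward(grad)                               # back-propagate to the input
    return x.grad.abs().squeeze(0).sum(0)          # aggregate over input channels

# Example: a toy stack of five 3x3 conv layers, in the spirit of the
# paper's artificially constructed CNNs (architecture assumed).
convs = nn.Sequential(*[nn.Conv2d(3 if i == 0 else 16, 16, 3, padding=1)
                        for i in range(5)])
erf = effective_receptive_field(convs)             # a (64, 64) gradient map
```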
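
The custom initialization is described only qualitatively (smaller weights at the kernel center, larger weights toward the edges), so the linear radial ramp below is an assumed stand-in for the authors' unspecified scaling, not their method:

```python
# Hedged sketch of an initialization in the spirit of the quoted scheme:
# kernel-weight magnitude grows with distance from the kernel center.
# The exact profile is an assumption; the paper does not specify it.
import torch
import torch.nn as nn

def center_suppressed_init_(conv: nn.Conv2d):
    """Rescale a conv kernel so weight scale increases radially outward.
    Assumes kernel_size > 1 (a 1x1 kernel has no radial structure)."""
    with torch.no_grad():
        k_h, k_w = conv.kernel_size
        ys = torch.arange(k_h).float() - (k_h - 1) / 2
        xs = torch.arange(k_w).float() - (k_w - 1) / 2
        radius = torch.sqrt(ys[:, None] ** 2 + xs[None, :] ** 2)
        scale = 0.5 + radius / radius.max()        # small at center, large outside
        nn.init.kaiming_normal_(conv.weight)       # standard base initialization
        conv.weight.mul_(scale)                    # broadcasts over (out, in, k_h, k_w)

conv = nn.Conv2d(16, 16, kernel_size=3, padding=1)
center_suppressed_init_(conv)
```

Any scale profile that increases monotonically with distance from the kernel center would fit the paper's qualitative description equally well; the ramp above is just the simplest such choice.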