Recurrent Convolutional Neural Networks for Scene Labeling

Authors: Pedro Pinheiro, Ronan Collobert

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested our proposed method on two different fully-labeled datasets: the Stanford Background (Gould et al., 2009) and the SIFT Flow Dataset (Liu et al., 2011). Table 3. Pixel and averaged per class accuracy and computing time of other methods and our proposed approaches on the Stanford Background Dataset.
Researcher Affiliation | Academia | ¹École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; ²Idiap Research Institute, Martigny, Switzerland
Pseudocode | No | The paper describes the system and methods in prose and mathematical equations but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper states, 'All the algorithms and experiments were implemented using Torch7 (Collobert et al., 2012),' but does not provide a link or explicit statement about the availability of their own source code for the described method.
Open Datasets | Yes | We tested our proposed method on two different fully-labeled datasets: the Stanford Background (Gould et al., 2009) and the SIFT Flow Dataset (Liu et al., 2011).
Dataset Splits | Yes | As in (Gould et al., 2009), we performed a 5-fold cross-validation with the dataset randomly split into 572 training images and 143 test images in each fold. All hyper-parameters were tuned with a 10% held-out validation data. (See the protocol sketch below the table.)
Hardware Specification | Yes | Our algorithms were run on a 4-core Intel i7.
Software Dependencies | Yes | All the algorithms and experiments were implemented using Torch7 (Collobert et al., 2012).
Experiment Setup | Yes | In all cases, the learning rate in (6) was equal to 10⁻⁴. All hyper-parameters were tuned with a 10% held-out validation data. CNN1 was trained with 133 × 133 input patches. The network was composed of a 6 × 6 convolution with nhu1 output planes, followed by a 8 × 8 pooling layer, a tanh(·) non-linearity, another 3 × 3 convolutional layer with nhu2 output planes, a 2 × 2 pooling layer, a tanh(·) non-linearity, and a final 7 × 7 convolution to produce label scores. (See the architecture sketch below the table.)
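
The evaluation protocol quoted in the 'Dataset Splits' row can be made concrete with a short sketch. This is an illustrative Python/scikit-learn version, not the authors' code: the use of scikit-learn, the random seeds, and the image-id placeholders are assumptions; only the 5-fold split, the 572/143 per-fold sizes, and the 10% held-out validation come from the paper.

```python
# Minimal sketch of the reported protocol: 5-fold cross-validation over the
# 715 Stanford Background images (572 train / 143 test per fold), with 10% of
# each training fold held out for hyper-parameter tuning.
# scikit-learn and the fixed random seeds are illustrative choices only.
from sklearn.model_selection import KFold, train_test_split

image_ids = list(range(715))  # 572 + 143 = 715 images in total
for fold, (train_idx, test_idx) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=0).split(image_ids)):
    # Carve a 10% validation set out of the 572 training images of this fold.
    train_idx, val_idx = train_test_split(train_idx, test_size=0.1, random_state=0)
    print(f"fold {fold}: {len(train_idx)} train, {len(val_idx)} val, {len(test_idx)} test")
```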
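
The CNN1 architecture quoted in the 'Experiment Setup' row can likewise be sketched. The authors implemented it in Torch7; the version below is a minimal PyTorch approximation. The feature-plane counts nhu1 and nhu2, the choice of max pooling, the 3-channel input, the 8-class output (Stanford Background), and the use of plain SGD are assumptions for illustration; only the layer shapes and the 10⁻⁴ learning rate are taken from the quote.

```python
# Minimal PyTorch sketch of CNN1 as described in the Experiment Setup row.
# nhu1, nhu2, in_channels, num_classes and the pooling type are placeholders.
import torch
import torch.nn as nn

def build_cnn1(in_channels=3, nhu1=25, nhu2=50, num_classes=8):
    # Spatial trace for a 133x133 patch:
    # 6x6 conv -> 128x128 -> 8x8 pool -> 16x16 -> 3x3 conv -> 14x14
    # -> 2x2 pool -> 7x7 -> 7x7 conv -> 1x1 label scores per class
    return nn.Sequential(
        nn.Conv2d(in_channels, nhu1, kernel_size=6),
        nn.MaxPool2d(kernel_size=8),
        nn.Tanh(),
        nn.Conv2d(nhu1, nhu2, kernel_size=3),
        nn.MaxPool2d(kernel_size=2),
        nn.Tanh(),
        nn.Conv2d(nhu2, num_classes, kernel_size=7),
    )

if __name__ == "__main__":
    model = build_cnn1()
    patch = torch.randn(1, 3, 133, 133)   # one 133x133 input patch
    scores = model(patch)                  # label scores for the centre pixel
    print(scores.shape)                    # torch.Size([1, 8, 1, 1])
    # Plain SGD with the reported learning rate of 10^-4 (optimizer choice is assumed).
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
```

Feeding the network a larger image instead of a 133 × 133 patch yields a coarse map of label scores rather than a single prediction, which is consistent with the fully-convolutional layout of the layers above.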