Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Authors: Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan Yuille
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Quantitatively, our method sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 71.6% IOU accuracy in the test set. We test our DeepLab model on the PASCAL VOC 2012 segmentation benchmark (Everingham et al., 2014), consisting of 20 foreground object classes and one background class. |
| Researcher Affiliation | Collaboration | Liang-Chieh Chen, Univ. of California, Los Angeles (lcchen@cs.ucla.edu); George Papandreou, Google Inc. (gpapan@google.com); Iasonas Kokkinos, CentraleSupélec and INRIA (iasonas.kokkinos@ecp.fr); Kevin Murphy, Google Inc. (kpmurphy@google.com); Alan L. Yuille, Univ. of California, Los Angeles (yuille@stat.ucla.edu) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | We share our source code, configuration files, and trained models that allow reproducing the results in this paper at a companion web site https://bitbucket.org/deeplab/deeplab-public. |
| Open Datasets | Yes | We test our DeepLab model on the PASCAL VOC 2012 segmentation benchmark (Everingham et al., 2014), consisting of 20 foreground object classes and one background class. The original dataset contains 1,464, 1,449, and 1,456 images for training, validation, and testing, respectively. The dataset is augmented by the extra annotations provided by Hariharan et al. (2011), resulting in 10,582 training images. |
| Dataset Splits | Yes | "The original dataset contains 1,464, 1,449, and 1,456 images for training, validation, and testing, respectively." and "We conduct the majority of our evaluations on the PASCAL val set, training our model on the augmented PASCAL train set." |
| Hardware Specification | Yes | Using our Caffe-based implementation and a Titan GPU, the resulting VGG-derived network is very efficient: Given a 306×306 input image, it produces 39×39 dense raw feature scores at the top of the network at a rate of about 8 frames/sec during testing. |
| Software Dependencies | No | The paper mentions 'Caffe framework (Jia et al., 2014)' but does not specify a version number for Caffe or any other software dependencies. |
| Experiment Setup | Yes | "We use a mini-batch of 20 images and initial learning rate of 0.001 (0.01 for the final classifier layer), multiplying the learning rate by 0.1 at every 2000 iterations. We use momentum of 0.9 and a weight decay of 0.0005." and "We fix the number of mean field iterations to 10 for all reported experiments." and "We use the default values of w2 = 3 and σγ = 3 and we search for the best values of w1, σα, and σβ by cross-validation on a small subset of the validation set (we use 100 images)." |
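
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' solver configuration; the function name `learning_rate`, its parameter names, and the `SOLVER` dictionary are hypothetical and only restate the quoted numbers. It assumes the rate drops by a factor of 0.1 once every 2000 iterations, starting from iteration 0.

```python
# Hyperparameters as quoted in the Experiment Setup row.
# The dictionary layout and key names are illustrative assumptions.
SOLVER = {
    "batch_size": 20,
    "base_lr": 0.001,         # initial rate for most layers
    "final_layer_lr": 0.01,   # initial rate for the final classifier layer
    "gamma": 0.1,             # multiplicative decay factor
    "step_size": 2000,        # iterations between decays
    "momentum": 0.9,
    "weight_decay": 0.0005,
}


def learning_rate(iteration, base_lr=SOLVER["base_lr"],
                  gamma=SOLVER["gamma"], step_size=SOLVER["step_size"]):
    """Step schedule: scale base_lr by gamma once per completed step_size iterations."""
    return base_lr * gamma ** (iteration // step_size)


if __name__ == "__main__":
    # Print the rate at a few iterations for both layer groups.
    for it in (0, 1999, 2000, 4000):
        backbone = learning_rate(it)
        classifier = learning_rate(it, base_lr=SOLVER["final_layer_lr"])
        print(f"iter {it:5d}: backbone lr={backbone:.2e}, classifier lr={classifier:.2e}")
```

Under this reading, the final classifier layer stays at ten times the rate of the other layers throughout training, since both rates decay by the same factor at the same iterations.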