CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation

Authors: Junsong Fan, Zhaoxiang Zhang, Tieniu Tan, Chunfeng Song, Jun Xiao (pp. 10762-10769)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct thorough experiments to demonstrate the effectiveness of the proposed approach. Our approach achieves new state-of-the-art results of weakly supervised semantic segmentation by only using image-level labels, with 64.3% mIoU on Pascal VOC 2012 validation set, and 65.3% on the test set.
Researcher Affiliation | Academia | 1Center for Research on Intelligent Perception and Computing, CASIA; 2National Laboratory of Pattern Recognition, CASIA; 3Center for Excellence in Brain Science and Intelligence Technology, CAS; 4University of Chinese Academy of Sciences. {fanjunsong2016, zhaoxiang.zhang}@ia.ac.cn, {tnt, chunfeng.song}@nlpr.ia.ac.cn, xiaojun@ucas.ac.cn
Pseudocode | No | The paper describes its method using textual descriptions and mathematical equations but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are implemented with MXNet (Chen et al. 2015), and are available at: https://github.com/jsfan/CIAN.
Open Datasets | Yes | We evaluate our proposed method on Pascal VOC 2012 segmentation benchmark (Everingham et al. 2010).
Dataset Splits | Yes | Following the common practice (Wei et al. 2017a; 2018; Huang et al. 2018), we use the expanded set collected by Hariharan et al. (Hariharan et al. 2011), i.e., there are 10582 training images, 1449 validation images, and 1456 testing images.
Hardware Specification | No | The paper mentions aspects like computation complexity and uses ResNet101 as a backbone, but does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies | No | Codes are implemented with MXNet (Chen et al. 2015), and CRF (Krähenbühl and Koltun 2011) is used for post-processing. However, specific version numbers for MXNet or the CRF implementation are not provided.
Experiment Setup | Yes | We adopt the SGD optimizer with an initial learning rate 5e-4 and momentum 0.9, which is poly-decayed by power 0.9. We use batch size 16 to train 20 epochs with randomly cropped images of size 321. Standard data augmentation, i.e., random cropping, scaling, and horizontal flipping are adopted.
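
The poly-decayed learning-rate schedule in the experiment setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the `poly_lr` function name is mine, and `max_steps` is an assumption derived from the stated 10582 training images, batch size 16, and 20 epochs (the paper does not give an exact iteration count).

```python
def poly_lr(base_lr, step, max_steps, power=0.9):
    # Poly decay as commonly defined: lr = base_lr * (1 - step / max_steps) ** power
    return base_lr * (1.0 - step / float(max_steps)) ** power

base_lr = 5e-4                   # initial learning rate from the paper
max_steps = (10582 // 16) * 20   # assumed: iterations per epoch * epochs
for step in (0, max_steps // 2, max_steps - 1):
    print(f"step {step:5d}: lr = {poly_lr(base_lr, step, max_steps):.2e}")
```

The schedule starts at 5e-4 and decays smoothly toward zero as training approaches the final iteration.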
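
The mIoU metric behind the 64.3%/65.3% results above can be illustrated with a toy sketch. Note that the official Pascal VOC protocol accumulates a confusion matrix over the whole evaluation set before computing per-class IoU; this simplified version (the `mean_iou` name is mine) scores a single prediction.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    # Mean intersection-over-union, averaged over classes that occur
    # in either the prediction or the ground truth.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [0, 1]])
print(mean_iou(pred, gt, num_classes=2))  # class 0: 1/2, class 1: 2/3 -> ~0.583
```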