Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation
Authors: Xueyi Li, Tianfei Zhou, Jianwu Li, Yi Zhou, Zhaoxiang Zhang
AAAI 2021, pp. 1984-1992
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the popular PASCAL VOC 2012 and COCO benchmarks, and our model yields state-of-the-art performance. |
| Researcher Affiliation | Academia | 1 Beijing Key Laboratory of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, China; 2 Computer Vision Laboratory, ETH Zurich, Switzerland; 3 School of Computer Science and Engineering, Southeast University, China; 4 Center for Research on Intelligent Perception and Computing, CASIA, China. Emails: xueyili@bit.edu.cn, tianfei.zhou@vision.ee.ethz.ch, ljw@bit.edu.cn |
| Pseudocode | No | The paper describes the methods using mathematical equations and textual explanations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/Lixy1997/Group-WSSS. |
| Open Datasets | Yes | We conduct our experiments on two datasets: PASCAL VOC 2012 (Everingham et al. 2010) and COCO (Lin et al. 2014). Following standard protocol (Huang et al. 2018; Lee et al. 2019; Wang et al. 2020b), extra data from SBD (Hariharan et al. 2011) is also used for training, leading to a total of 10,582 training images. (See the dataset-split sketch below the table for how this augmented list is typically assembled.) |
| Dataset Splits | Yes | For PASCAL VOC 2012: We evaluate our model on the standard validation and test sets, which have 1,449 and 1,456 images, respectively. For COCO: Following (Wang et al. 2020a), we use the default train/val splits (80k images for training and 40k for validation) in the experiment. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions software components like VGG16, ResNet101, DeepLab-v2, and SGD optimizer, but it does not specify any version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For the classification network, the number of nodes K and message passing steps T in the GNN are separately set to 4 and 3 by default. The input image size is 224 × 224. The entire network is trained using the SGD optimizer with initial learning rates of 1e-3 for the backbone and 1e-2 for the GNN, which are reduced by 0.1 every five epochs. The total number of epochs, momentum and weight decay are set to 15, 0.9, and 5e-4, respectively. The λ in Eq. (12) is empirically set to 0.4 and the d in Eq. (5) is set to 4. (See the training-setup sketch below the table.) |
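
The 10,582-image figure quoted in the Open Datasets row comes from the standard "train_aug" protocol: the VOC 2012 train ids are merged with the SBD train/val ids, and any overlap with the VOC val set is removed. The following is a minimal sketch of that set arithmetic; the directory layout (`VOCdevkit/...`, `benchmark_RELEASE/...`) is an assumption about how the two datasets are typically unpacked, not something the paper specifies.

```python
"""Sketch: assembling the augmented VOC 2012 training list (train_aug).

Assumes VOC 2012 and SBD are unpacked under the paths below; neither
the paths nor this exact procedure is stated in the paper.
"""
from pathlib import Path

VOC_SETS = Path("VOCdevkit/VOC2012/ImageSets/Segmentation")  # assumed layout
SBD_SETS = Path("benchmark_RELEASE/dataset")                 # assumed layout

def read_ids(path: Path) -> set[str]:
    """Read one image id per line from a split file."""
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}

voc_train = read_ids(VOC_SETS / "train.txt")  # 1,464 finely annotated ids
voc_val = read_ids(VOC_SETS / "val.txt")      # 1,449 ids, held out for evaluation
sbd = read_ids(SBD_SETS / "train.txt") | read_ids(SBD_SETS / "val.txt")

# Union of VOC train and SBD, minus anything that would leak into the val set.
train_aug = sorted((voc_train | sbd) - voc_val)
print(len(train_aug))  # expected: 10582
```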
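
The training hyperparameters quoted in the Experiment Setup row map directly onto a standard PyTorch SGD configuration with two parameter groups and a step decay. The sketch below is one reading of that description, not the authors' code; `backbone` and `gnn` are hypothetical placeholder modules standing in for the classifier's two parts, and the released repository (https://github.com/Lixy1997/Group-WSSS) remains authoritative.

```python
"""Sketch: SGD setup matching the reported hyperparameters (PyTorch).

`backbone` and `gnn` are hypothetical nn.Module placeholders.
"""
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))  # placeholder backbone
gnn = nn.Sequential(nn.Linear(64, 20))                    # placeholder GNN head

# Two parameter groups: initial lr 1e-3 for the backbone, 1e-2 for the GNN.
optimizer = torch.optim.SGD(
    [
        {"params": backbone.parameters(), "lr": 1e-3},
        {"params": gnn.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
    weight_decay=5e-4,
)

# Both learning rates are multiplied by 0.1 every five epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

for epoch in range(15):  # 15 epochs in total
    # ... one pass over the 224x224 training inputs, optimizer.step() per batch ...
    scheduler.step()
```

The two-group construction is the natural way to express the paper's split learning rates, since PyTorch's `SGD` applies the shared `momentum` and `weight_decay` to every group while honoring each group's own `lr`.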