Learning Visual Words for Weakly-Supervised Semantic Segmentation
Authors: Lixiang Ru, Bo Du, Chen Wu
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Based on the proposed methods, we conducted experiments on the PASCAL VOC 2012 dataset. Our proposed method achieved 67.2% mIoU on the val set and 67.3% mIoU on the test set, which outperformed recent state-of-the-art methods. |
| Researcher Affiliation | Academia | Lixiang Ru¹, Bo Du¹ and Chen Wu²; ¹National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, School of Computer Science and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China; ²LIESMARS, Wuhan University, Wuhan, China |
| Pseudocode | No | The paper describes algorithms and processes in text and diagrams but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'the official code released at GitHub without changing any settings' for IRNet, which is a third-party tool for refinement, but does not provide open-source code for their own proposed method. |
| Open Datasets | Yes | The proposed network is trained and evaluated on the PASCAL VOC 2012 dataset [Everingham et al., 2015]. This dataset includes 21 semantic categories, including 20 foreground classes and the background class. Following the common practice, this dataset is augmented with SBD dataset [Hariharan et al., 2011]. |
| Dataset Splits | Yes | The train and val set of the augmented dataset consist of 10582 and 1449 images, respectively. |
| Hardware Specification | No | The paper mentions using ResNet50 as a backbone but does not specify any hardware details like GPU models, CPU types, or memory used for experiments. |
| Software Dependencies | No | The paper mentions using the SGD optimizer, PyTorch (implicitly through [Paszke et al., 2019]), DeepLabv2, and IRNet, but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | The classification network is trained for 6 epochs with a batch size of 16, using the SGD optimizer. The learning rate is initially set to 0.01 for backbone parameters and 0.1 for the other parameters, and decays every iteration with a polynomial decay strategy. The number of visual words and the weight factor γ in Eq. 7 are set to 256 and 2, respectively. |
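The reported setup (SGD, per-group initial learning rates of 0.01 and 0.1, per-iteration polynomial decay over 6 epochs at batch size 16 on the 10582-image train set) can be sketched as below. Note the decay power is an assumption, as the paper does not state it; 0.9 is the value conventionally used with DeepLab-style training.

```python
# Hedged sketch of the per-iteration polynomial LR schedule described in
# the paper's experiment setup. The decay power (0.9) is assumed, not
# taken from the paper.
EPOCHS = 6
BATCH_SIZE = 16
TRAIN_IMAGES = 10582  # augmented PASCAL VOC 2012 train set
MAX_ITER = EPOCHS * (TRAIN_IMAGES // BATCH_SIZE)

def poly_lr(base_lr: float, cur_iter: int,
            max_iter: int = MAX_ITER, power: float = 0.9) -> float:
    """lr = base_lr * (1 - cur_iter / max_iter) ** power, updated every iteration."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power

# Two parameter groups, as reported: backbone at 0.01, remaining layers at 0.1.
backbone_schedule = [poly_lr(0.01, it) for it in range(MAX_ITER)]
other_schedule = [poly_lr(0.1, it) for it in range(MAX_ITER)]
```

In a training loop, these values would be written into the optimizer's parameter groups before each step; the factor-of-10 gap between the backbone and the newly initialized layers is a common fine-tuning convention.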