Multi-Knowledge Aggregation and Transfer for Semantic Segmentation
Authors: Yuang Liu, Wei Zhang, Jun Wang (pp. 1837-1845)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the effectiveness of our proposed approach, we conduct extensive experiments on three segmentation datasets: Pascal VOC, Cityscapes, and CamVid, showing that MKAT outperforms the other KD methods. |
| Researcher Affiliation | Academia | East China Normal University, Shanghai, China |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It describes the methodology in text and mathematical formulations. |
| Open Source Code | No | The paper does not contain any statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Experiments Datasets Pascal VOC 2012. It (Everingham and Winn 2011) contains 20 foreground object classes and an extra background class. Following (Chen et al. 2017a; Zhao et al. 2017), we use the additional annotation provided by (Hariharan et al. 2011), resulting in 10,582 labeled images for training. Cityscapes. Cityscapes (Cordts et al. 2016) is for urban scene understanding and contains 30 classes... It contains 2,975 fine annotation images for training... CamVid. CamVid (Brostow et al. 2008) is an automotive dataset, containing 367 training and 233 testing images... |
| Dataset Splits | Yes | Cityscapes. Cityscapes (Cordts et al. 2016) is for urban scene understanding and contains 30 classes with only 19 classes used for evaluation. It contains 2,975 fine annotation images for training, 500 for validation, and 1,525 for testing. |
| Hardware Specification | No | The paper states 'Our approach is implemented by PyTorch.' but does not provide any specific details about the hardware used for the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions 'Our approach is implemented by PyTorch.' but does not specify the version number for PyTorch or list any other software dependencies with their version numbers. |
| Experiment Setup | Yes | Training setup. Our approach is implemented in PyTorch. We employ DeepLabV3 with ResNet101 as the teacher network on Cityscapes, and DeepLabV3 with ResNet50 for VOC and CamVid. The student networks use the default DeepLabV3 architecture with compact backbones (i.e., ResNet18 or MobileNet). All models are trained alone or with different KD methods by mini-batch stochastic gradient descent (SGD) with momentum 0.9 and weight decay 0.0005 for 120 epochs. Following (Chen et al. 2017b), we employ a poly learning rate policy where the initial learning rate is multiplied by (1 − iter/total_iters)^0.9 after each iteration. The initial learning rate of the backbone and encoders is set to 0.007, which is 0.1 times that of the auxiliary head or classifier. The batch size is set to 8, but 4 when testing on Cityscapes due to its high resolution. For data augmentation, we apply random horizontal flipping and random cropping (crop size 513/768/540 for VOC/Cityscapes/CamVid) during training. The hyperparameters {τ, β} are set to {6.0, 0.5} for CamVid and {10.0, 0.5} for VOC/Cityscapes, respectively. α is chosen from [10, 15], with a default of 10. The channel dimension of the latent knowledge is set to 256 in all experiments. |
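The poly learning-rate policy quoted in the experiment setup can be sketched in plain Python. This is a minimal illustration, not the authors' code; the base learning rate (0.007) and power (0.9) are taken from the quoted setup, while the function name and iteration counts are illustrative.

```python
def poly_lr(base_lr: float, cur_iter: int, total_iters: int, power: float = 0.9) -> float:
    """Poly policy: scale base_lr by (1 - cur_iter/total_iters)^power."""
    return base_lr * (1.0 - cur_iter / total_iters) ** power

# Backbone/encoder LR starts at 0.007 and decays toward 0 over training;
# the auxiliary head/classifier would use a base LR 10x larger (0.07).
lr_start = poly_lr(0.007, 0, 1000)    # 0.007 at the first iteration
lr_mid = poly_lr(0.007, 500, 1000)    # roughly halved midway through
```

In PyTorch this schedule is typically wired up via `torch.optim.lr_scheduler.LambdaLR` with a lambda returning the `(1 - iter/total_iters)^0.9` multiplier, stepped once per iteration.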