Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient Semantic Segmentation
Authors: Jiawei Fan, Chao Li, Xiaolong Liu, Meina Song, Anbang Yao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five mainstream benchmarks with various teacher-student network pairs demonstrate the effectiveness of our approach. Experimental results demonstrate that: (i) Af-DCD exhibits superior performance compared to state-of-the-art methods on various benchmarks with different teacher-student network pairs; (ii) Af-DCD exhibits even more significant improvements on larger datasets, such as ADE20K, indicating it can enhance student's generalization capability. |
| Researcher Affiliation | Collaboration | Jiawei Fan, Intel Labs China (jiawei.fan@intel.com); Chao Li, Intel Labs China (chao3.li@intel.com); Xiaolong Liu, HoloMatic Technology Co., Ltd. (liuxiaolong@holomatic.com); Meina Song, BUPT (mnsong@bupt.edu.cn); Anbang Yao, Intel Labs China (anbang.yao@intel.com) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/OSVAI/Af-DCD. |
| Open Datasets | Yes | Five popular semantic segmentation datasets, including Cityscapes [27], Pascal VOC [28], Camvid [29], ADE20K [30] and COCO-Stuff-164K [31], are used in our experiments. |
| Dataset Splits | Yes | Five popular semantic segmentation datasets, including Cityscapes [27], Pascal VOC [28], Camvid [29], ADE20K [30] and COCO-Stuff-164K [31], are used in our experiments. Following general settings [9, 20, 15] in semantic segmentation distillation... |
| Hardware Specification | Yes | The training time is measured on 8 NVIDIA RTX A5000 GPUs with 40000 iterations. |
| Software Dependencies | Yes | we implement our method on both MMSegmentation codebase [35] and CIRKD codebase [9]. |
| Experiment Setup | Yes | In the training phase, all models are optimized by SGD with a momentum of 0.9, an initial learning rate of 0.02, and a batch size of 16. The input size is 512 × 1024, 400 × 400, 512 × 1024, 512 × 1024 for experiments on Pascal VOC, CamVid, ADE20K and COCO-Stuff-164K, respectively. The input sizes for experiments on Cityscapes differ between the two codebases: 512 × 1024 in the CIRKD codebase and 512 × 512 in the MMSegmentation codebase [15]. Our masked reconstruction generator consists of two 3 × 3 convolutional layers with ReLU, following [15]. |
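The setup row above can be sketched in code. The following is a minimal, hypothetical PyTorch sketch of the described components: the masked reconstruction generator (two 3 × 3 convolutional layers with ReLU, here placed between the two convolutions) and the quoted SGD settings (momentum 0.9, learning rate 0.02). The channel width of 256 and the ReLU placement are assumptions for illustration; the excerpt does not specify them.

```python
import torch
import torch.nn as nn

class MaskedReconstructionGenerator(nn.Module):
    """Sketch of the paper's masked reconstruction generator:
    two 3x3 conv layers with ReLU. Channel width (256) is an
    assumption, not stated in the excerpt."""

    def __init__(self, channels: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            # padding=1 keeps the spatial size of the feature map
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = MaskedReconstructionGenerator(channels=256)

# Optimizer settings quoted from the setup row: SGD, lr 0.02, momentum 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.9)

# Dummy student feature map; spatial size preserved by the generator.
feat = torch.randn(2, 256, 64, 128)
out = model(feat)
print(out.shape)  # torch.Size([2, 256, 64, 128])
```

Batch size 16 and the dataset-specific input sizes would apply to the data loader, not to this module.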