Head-Free Lightweight Semantic Segmentation with Linear Transformer
Authors: Bo Dong, Pichao Wang, Fan Wang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on widely adopted datasets demonstrate that AFFormer achieves superior accuracy while retaining only 3M parameters. On the ADE20K dataset, AFFormer achieves 41.8 mIoU and 4.6 GFLOPs, which is 4.4 mIoU higher than Segformer with 45% less GFLOPs. On the Cityscapes dataset, AFFormer achieves 78.7 mIoU and 34.4 GFLOPs, which is 2.5 mIoU higher than Segformer with 72.5% less GFLOPs. |
| Researcher Affiliation | Industry | Alibaba Group. Emails: {bo.dong.cst, pichaowang}@gmail.com; fan.w@alibaba-inc.com. Footnotes: work done during an internship at Alibaba Group; corresponding author, whose work was done at Alibaba Group and who is now affiliated with Amazon Prime Video. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/dongbo811/AFFormer. |
| Open Datasets | Yes | We validate the proposed AFFormer on three publicly available datasets: ADE20K (Zhou et al. 2017), Cityscapes (Cordts et al. 2016) and COCO-stuff (Caesar, Uijlings, and Ferrari 2018). |
| Dataset Splits | No | The paper specifies training parameters and data augmentation but does not explicitly state the dataset splits (e.g., percentages or sample counts) for training, validation, or testing. |
| Hardware Specification | Yes | The FPS is tested on an NVIDIA V100 GPU with a batch size of 1 at a resolution of 1024×2048. |
| Software Dependencies | No | We implement our AFFormer with the PyTorch framework based on the MMSegmentation toolbox (OpenMMLab 2020). The paper names these frameworks but does not provide specific version numbers for PyTorch or MMSegmentation. |
| Experiment Setup | Yes | During semantic segmentation training, we employ the widely used AdamW optimizer for all datasets to update the model parameters. For the ADE20K and Cityscapes datasets, we adopt the default 160K training iterations from Segformer, with mini-batch sizes of 16 and 8, respectively. For the COCO-stuff dataset, we set the training iterations to 80K and the mini-batch size to 16. In addition, we apply data augmentation during training by random horizontal flipping, random resizing with a ratio of 0.5-2.0, and random cropping. |
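
The efficiency claims in the Research Type row can be sanity-checked with simple arithmetic. The sketch below back-solves the Segformer baseline cost implied by the stated percentage reductions; only the AFFormer GFLOPs and the percentages come from the table, while the derived baseline values are inferred, not numbers reported here.

```python
# Back-of-the-envelope check of the GFLOPs claims in the "Research Type" row.
# Only the AFFormer numbers and percentage reductions come from the table;
# the Segformer baselines are *implied* values solved from those figures.

def implied_baseline(model_gflops: float, reduction: float) -> float:
    """Solve for the baseline from: model = baseline * (1 - reduction)."""
    return model_gflops / (1.0 - reduction)

print(f"ADE20K:     implied Segformer cost = {implied_baseline(4.6, 0.45):.1f} GFLOPs")
print(f"Cityscapes: implied Segformer cost = {implied_baseline(34.4, 0.725):.1f} GFLOPs")
# -> roughly 8.4 and 125.1 GFLOPs, consistent with the "45% less" and
#    "72.5% less" readings of the reported reductions
```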
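
The Hardware Specification row fixes only the GPU, batch size, and input resolution. A minimal PyTorch timing sketch consistent with that description is shown below; the warm-up count, iteration count, and the `model` placeholder are assumptions, since the paper does not describe its exact timing protocol.

```python
import time
import torch

# Minimal FPS-measurement sketch matching the reported setup: batch size 1,
# 1024x2048 input, single CUDA GPU (a V100 in the paper). Warm-up and
# iteration counts are assumptions, not values from the paper.

@torch.no_grad()
def measure_fps(model: torch.nn.Module, iters: int = 100, warmup: int = 10) -> float:
    model.eval().cuda()
    x = torch.randn(1, 3, 1024, 2048, device="cuda")
    for _ in range(warmup):       # warm up kernels / cuDNN autotuning
        model(x)
    torch.cuda.synchronize()      # ensure warm-up work has finished
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()      # wait for all queued kernels before stopping the clock
    return iters / (time.perf_counter() - start)
```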
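
The Experiment Setup row translates naturally into an MMSegmentation-style configuration. The fragment below is a sketch of the ADE20K case (AdamW, 160K iterations, batch size 16, flip/resize/crop augmentation) in MMSegmentation 0.x conventions; the learning rate, weight decay, crop size, image scale, and per-GPU split are illustrative assumptions, since the table only fixes the optimizer, iteration counts, batch sizes, and augmentation types.

```python
# Sketch of the described training setup as an MMSegmentation 0.x-style config
# fragment (ADE20K case). Values marked "assumed" are not stated in the table.

optimizer = dict(type='AdamW', lr=1e-4, weight_decay=0.01)  # lr/weight decay assumed
runner = dict(type='IterBasedRunner', max_iters=160000)     # 160K iters (ADE20K/Cityscapes)
data = dict(samples_per_gpu=2, workers_per_gpu=2)           # batch 16 total over 8 GPUs (assumed split)

crop_size = (512, 512)  # assumed crop size for ADE20K
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),  # random resizing, ratio 0.5-2.0
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),    # random cropping
    dict(type='RandomFlip', prob=0.5),                                   # random horizontal flipping
    dict(type='Normalize', mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375], to_rgb=True),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
```

For Cityscapes the same structure would apply with a mini-batch size of 8, and for COCO-stuff with `max_iters=80000` and a mini-batch size of 16, per the row above.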