SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
Authors: Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on semantic segmentation and image classification tasks. First, we describe implementation details and compare results with the state of the art. We then conduct a series of ablation studies to validate the design of SeaFormer. |
| Researcher Affiliation | Collaboration | Qiang Wan¹, Zilong Huang², Jiachen Lu¹, Gang Yu², Li Zhang¹ — ¹School of Data Science, Fudan University; ²Tencent PCG |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and models are made publicly available at https://github.com/fudan-zvg/SeaFormer. |
| Open Datasets | Yes | We perform segmentation experiments over ADE20K Zhou et al. (2017) and Cityscapes Cordts et al. (2016). The mean intersection over union (mIoU) is set as the evaluation metric. We convert full-precision models to TNN Contributors (2019) and measure latency on an ARM-based device with a single Qualcomm Snapdragon 865 processor. |
| Dataset Splits | Yes | The ADE20K dataset covers 150 categories and contains 25K images, split into 20K/2K/3K for train, val, and test. |
| Hardware Specification | Yes | We convert full-precision models to TNN Contributors (2019) and measure latency on an ARM-based device with a single Qualcomm Snapdragon 865 processor. |
| Software Dependencies | No | The paper mentions using the TNN and mmsegmentation frameworks and the AdamW optimizer, but does not provide specific version numbers for these or other software libraries (e.g., Python, PyTorch) that are typically required for reproducibility. |
| Experiment Setup | Yes | The initial learning rate is 0.0005 and the weight decay is 0.01. A poly learning rate schedule with factor 1.0 is adopted. During inference, we set the same resize and crop rules as TopFormer to ensure fairness. |
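
The quoted setup (initial learning rate 0.0005, weight decay 0.01, poly schedule with factor 1.0, AdamW optimizer) can be expressed roughly as in the sketch below. This is a minimal illustration, not the authors' released training code; the stand-in model and the iteration budget `MAX_ITERS` are assumptions, since they are not stated in the quote above.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Values quoted in the paper's setup; everything else is a placeholder.
BASE_LR = 0.0005
WEIGHT_DECAY = 0.01
POLY_POWER = 1.0          # "poly learning rate schedule with factor 1.0"
MAX_ITERS = 160_000       # assumed iteration budget, not stated in the quote

model = torch.nn.Conv2d(3, 150, kernel_size=1)  # stand-in for the SeaFormer model

optimizer = AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

# Poly decay: lr_t = base_lr * (1 - t / max_iters) ** power
scheduler = LambdaLR(
    optimizer,
    lr_lambda=lambda t: (1.0 - t / MAX_ITERS) ** POLY_POWER,
)

# Per training iteration: compute the loss, call loss.backward(), then
# optimizer.step(); scheduler.step(); optimizer.zero_grad()
```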
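
For reference, the mIoU metric named in the dataset row is conventionally computed from a confusion matrix of ground-truth versus predicted labels. The sketch below follows that standard definition and is not the evaluation code used in the paper; `ignore_index=255` is an assumed convention for unlabeled pixels.

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Per-class IoU and mIoU from predicted and ground-truth label maps."""
    valid = gt != ignore_index
    pred, gt = pred[valid], gt[valid]
    # Confusion matrix: rows = ground truth, columns = prediction.
    conf = np.bincount(
        gt.astype(np.int64) * num_classes + pred.astype(np.int64),
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    # Classes absent from both prediction and ground truth are excluded (NaN).
    iou = np.where(union > 0, tp / np.maximum(union, 1), np.nan)
    return iou, np.nanmean(iou)

# Example: random 4x4 label maps over the 150 ADE20K classes.
iou, miou = mean_iou(np.random.randint(0, 150, (4, 4)),
                     np.random.randint(0, 150, (4, 4)), num_classes=150)
```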