When Semantic Segmentation Meets Frequency Aliasing
Authors: Linwei Chen, Lin Gu, Ying Fu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate consistent improvements in semantic segmentation and low-light instance segmentation tasks. |
| Researcher Affiliation | Academia | Linwei Chen (1), Lin Gu (2,3) & Ying Fu (1); (1) Beijing Institute of Technology, Beijing, China; (2) RIKEN AIP, Tokyo, Japan; (3) The University of Tokyo, Tokyo, Japan |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at: https://github.com/Linwei-Chen/Seg-Aliasing. |
| Open Datasets | Yes | For semantic segmentation, we employ widely-used and challenging Cityscapes (Cordts et al., 2016), PASCAL VOC (Everingham et al., 2010), and ADE20K (Zhou et al., 2017) datasets. |
| Dataset Splits | Yes | Cityscapes (Cordts et al., 2016) consists of 5,000 finely annotated images... This dataset is meticulously divided into 2,975 images for the training set, 500 for the validation set, and 1,525 for the testing set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | UPerNet (Xiao et al., 2018) is trained on the Cityscapes dataset using a crop size of 768×768, a batch size of 8, and a total of 80K iterations. We employ stochastic gradient descent (SGD) with a momentum of 0.9 and a weight decay of 5e-4. The initial learning rate is set at 0.01. During training, we adjust the learning rate using the common poly learning rate policy, which reduces the initial learning rate by multiplying it by (1 - iter/max_iter)^0.9. We apply standard data augmentation techniques, including random horizontal flipping and random resizing within the range of 0.5 to 2. |
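
The poly learning rate policy quoted in the experiment setup can be sketched as a small function. This is a minimal illustration of the schedule as described (initial LR 0.01, 80K iterations, power 0.9), not code from the paper's repository; the function name and defaults are our own.

```python
def poly_lr(iteration, base_lr=0.01, max_iter=80_000, power=0.9):
    """Poly policy from the quoted setup:
    lr = base_lr * (1 - iteration / max_iter) ** power.
    The learning rate starts at base_lr and decays smoothly to 0
    by the final iteration."""
    return base_lr * (1 - iteration / max_iter) ** power


# At iteration 0 the schedule returns the initial learning rate;
# halfway through training the multiplier is 0.5 ** 0.9.
start_lr = poly_lr(0)
mid_lr = poly_lr(40_000)
end_lr = poly_lr(80_000)
```

In practice this multiplier is typically applied per iteration via a scheduler (e.g., a `LambdaLR` in PyTorch) rather than recomputed by hand.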