Weakly Supervised Semantic Segmentation for Driving Scenes
Authors: Dongseob Kim, Seungho Lee, Junsuk Choe, Hyunjung Shim
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Notably, the proposed method achieves 51.8% mIoU on the Cityscapes test dataset, showcasing its potential as a strong WSSS baseline on driving scene datasets. Experimental results on CamVid and WildDash2 demonstrate the effectiveness of our method across diverse datasets, even with small-scale datasets or visually challenging conditions. |
| Researcher Affiliation | Academia | 1 Yonsei University, South Korea 2 Sogang University, South Korea 3 Korea Advanced Institute of Science & Technology, South Korea |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/k0u-id/CARB. |
| Open Datasets | Yes | For performance evaluation, we utilized the well-known Cityscapes (Cordts et al. 2016), CamVid (Brostow, Fauqueur, and Cipolla 2009), and WildDash2 (Zendel et al. 2022), which are autonomous driving datasets. |
| Dataset Splits | Yes | The Cityscapes dataset consists of 2,975 training, 500 validation, and 1,525 test images with fine annotation. ... The CamVid dataset consists of 367 training, 101 validation, and 233 test images... The WildDash2 dataset consists of 3,618 training, 638 validation, and 812 test images... |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running its experiments. |
| Software Dependencies | No | The paper mentions "MMSeg (Contributors 2020)" but does not specify version numbers for MMSeg or any other key software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We set the length of one side to 512 and varied the length of the other side between 128 and 512. ... we set a random value between 1.0 and 2.0 as the resize ratio. ... Adaptive region balancing is applied from 16K iteration. ... The proposed method consists of two stages. In the first stage, we warm up the baseline segmentation model with global and local views generated from CLIP masks. In the second stage, we refine the segmentation network utilizing CARB. |
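The view-sizing step quoted in the Experiment Setup row can be sketched in code. This is a minimal, hypothetical reconstruction based only on the quoted numbers (one side fixed at 512, the other sampled in [128, 512], and a random resize ratio in [1.0, 2.0]); the paper's exact sampling scheme and function names are not specified, so everything below is an assumption for illustration.

```python
import random

def sample_view_size(fixed_side=512, min_side=128, max_side=512,
                     min_ratio=1.0, max_ratio=2.0):
    """Hypothetical sketch of the view sizing described in the setup:
    one side is fixed at 512, the other side is sampled uniformly in
    [128, 512], then both sides are scaled by a random resize ratio
    drawn from [1.0, 2.0]. The choice of which dimension is fixed is
    also randomized here, since the paper does not specify it.
    """
    other = random.randint(min_side, max_side)
    ratio = random.uniform(min_ratio, max_ratio)
    # Randomly decide whether the fixed side is the height or the width.
    if random.random() < 0.5:
        h, w = fixed_side, other
    else:
        h, w = other, fixed_side
    return round(h * ratio), round(w * ratio)

if __name__ == "__main__":
    random.seed(0)
    for _ in range(3):
        print(sample_view_size())
```

Under these assumptions, every sampled dimension falls in [128, 1024]: the shortest possible side is 128 x 1.0 and the longest is 512 x 2.0.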