High Resolution Feature Recovering for Accelerating Urban Scene Parsing
Authors: Rui Zhang, Sheng Tang, Luoqi Liu, Yongdong Zhang, Jintao Li, Shuicheng Yan
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments In this section, we perform experiments on the two challenging urban scene parsing benchmarks, including Cityscapes dataset and CamVid dataset. 4.1 Experimental Settings ... 4.2 Results on Cityscapes Dataset ... Table 1: Accuracy and inference time of the proposed method, evaluated on Cityscapes validation set on the DeepLab-v2 framework. |
| Researcher Affiliation | Collaboration | 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 2 Qihoo 360 Artificial Intelligence Institute, Beijing, China 3 University of Chinese Academy of Sciences, Beijing, China 4 Department of Electrical and Computer Engineering, National University of Singapore, Singapore |
| Pseudocode | No | The paper describes the proposed framework and its modules, but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the proposed HRFR method or provide a link to a code repository. |
| Open Datasets | Yes | Extensive experiments on the two challenging Cityscapes and CamVid datasets well demonstrate the effectiveness of the proposed HRFR framework... Cityscapes Dataset The Cityscapes dataset is taken by car-carried cameras and collected in street scenes from 50 different cities. It contains 5000 images of 19 semantic classes, 2975 images for training, 500 images for validation and 1525 for testing. CamVid Dataset The CamVid dataset is collected with images captured from driving videos at daytime and dusk. It contains 701 images with pixel-level annotations on 11 semantic classes. |
| Dataset Splits | Yes | Cityscapes Dataset ... It contains 5000 images of 19 semantic classes, 2975 images for training, 500 images for validation and 1525 for testing. |
| Hardware Specification | Yes | Our experiments are implemented based on the MXNet platform and performed on NVIDIA Tesla K40 GPUs. |
| Software Dependencies | No | The paper mentions using the 'MXNet platform' but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | During training, the entire framework is trained end-to-end by the objective function in Equation (10), where we set the boundary radius r = 5, loss weights λ1=λ2=1, λ3=0.5, γ1=2, γ2=1 empirically. We adopt the standard stochastic gradient descent (SGD) with a mini-batch of 4 samples. The learning rate is maintained at 0.0005 for 60 epochs. We randomly crop samples of 500×500 from images, and apply horizontal flip and random resizing between 0.5 and 1.5. |
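The augmentation recipe quoted in the Experiment Setup row (random resizing between 0.5 and 1.5, random 500×500 crop, horizontal flip) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' MXNet implementation; the function name `augment`, the nearest-neighbor resizing, and the use of 255 as the ignore label for padded regions are assumptions for the sketch.

```python
import numpy as np

def augment(image, label, crop_size=500, scale_range=(0.5, 1.5), rng=None):
    """Random resize, random crop, and horizontal flip, per the quoted setup.

    image: (H, W, 3) uint8 array; label: (H, W) integer class map.
    """
    rng = rng or np.random.default_rng()

    # Random resizing between 0.5x and 1.5x (nearest-neighbor for simplicity).
    scale = rng.uniform(*scale_range)
    h, w = image.shape[:2]
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    ys = np.arange(nh) * h // nh
    xs = np.arange(nw) * w // nw
    image = image[ys][:, xs]
    label = label[ys][:, xs]

    # Pad if smaller than the crop window; padded label pixels get an
    # assumed ignore index of 255 so they do not contribute to the loss.
    ph, pw = max(crop_size - nh, 0), max(crop_size - nw, 0)
    if ph or pw:
        image = np.pad(image, ((0, ph), (0, pw), (0, 0)))
        label = np.pad(label, ((0, ph), (0, pw)), constant_values=255)

    # Random 500x500 crop.
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    image = image[top:top + crop_size, left:left + crop_size]
    label = label[top:top + crop_size, left:left + crop_size]

    # Random horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        image, label = image[:, ::-1], label[:, ::-1]
    return image, label
```

Applied to a Cityscapes-sized input (1024×2048), this always yields a 500×500 image/label pair regardless of the sampled scale.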