High Resolution Feature Recovering for Accelerating Urban Scene Parsing

Authors: Rui Zhang, Sheng Tang, Luoqi Liu, Yongdong Zhang, Jintao Li, Shuicheng Yan

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments: In this section, we perform experiments on the two challenging urban scene parsing benchmarks, including the Cityscapes dataset and the CamVid dataset. 4.1 Experimental Settings ... 4.2 Results on Cityscapes Dataset ... Table 1: Accuracy and inference time of the proposed method, evaluated on the Cityscapes validation set on the DeepLab-v2 framework.
Researcher Affiliation | Collaboration | 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 2 Qihoo 360 Artificial Intelligence Institute, Beijing, China; 3 University of Chinese Academy of Sciences, Beijing, China; 4 Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Pseudocode | No | The paper describes the proposed framework and its modules, but it does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the proposed HRFR method, nor does it provide a link to a code repository.
Open Datasets | Yes | Extensive experiments on the two challenging Cityscapes and CamVid datasets well demonstrate the effectiveness of the proposed HRFR framework... Cityscapes Dataset: The Cityscapes dataset is taken by car-carried cameras and collected in street scenes from 50 different cities. It contains 5000 images of 19 semantic classes: 2975 images for training, 500 images for validation and 1525 for testing. CamVid Dataset: The CamVid dataset is collected with images captured from driving videos at daytime and dusk. It contains 701 images with pixel-level annotations on 11 semantic classes.
Dataset Splits | Yes | Cityscapes Dataset ... It contains 5000 images of 19 semantic classes: 2975 images for training, 500 images for validation and 1525 for testing.
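The reported split sizes are internally consistent, as a quick arithmetic check shows (plain Python; the `splits` dict is just an illustrative container for the numbers quoted above):

```python
# Cityscapes split sizes as quoted in the paper.
splits = {"train": 2975, "val": 500, "test": 1525}

# Together the three splits account for all 5000 finely annotated images.
total = sum(splits.values())
print(total)  # prints 5000
```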
Hardware Specification | Yes | Our experiments are implemented based on the MXNet platform and performed on NVIDIA Tesla K40 GPUs.
Software Dependencies | No | The paper mentions using the MXNet platform but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | During training, the entire framework is trained end-to-end with the objective function in Equation (10), where we set the boundary radius r = 5 and the loss weights λ1 = λ2 = 1, λ3 = 0.5, γ1 = 2, γ2 = 1 empirically. We adopt standard stochastic gradient descent (SGD) with a mini-batch of 4 samples. The learning rate is maintained at 0.0005 for 60 epochs. We randomly crop 500×500 samples from the images, and apply horizontal flipping and random resizing between 0.5 and 1.5.
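The data-augmentation recipe quoted above (random resizing in [0.5, 1.5], random horizontal flip, 500×500 random crop) can be sketched as parameter sampling. This is a minimal illustration of the described pipeline, not the paper's implementation; the function name, argument names, and return format are all hypothetical:

```python
import random

def sample_augmentation(height, width, crop=500, scale_range=(0.5, 1.5)):
    """Sample one set of augmentation parameters per training image:
    a resize scale drawn uniformly from [0.5, 1.5], a horizontal flip
    with probability 0.5, and the top-left corner of a 500x500 crop."""
    scale = random.uniform(*scale_range)
    new_h, new_w = int(height * scale), int(width * scale)
    flip = random.random() < 0.5
    # Clamp so the crop window fits inside the (possibly downscaled) image.
    y0 = random.randint(0, max(new_h - crop, 0))
    x0 = random.randint(0, max(new_w - crop, 0))
    return {"scale": scale, "flip": flip, "crop_box": (y0, x0, crop, crop)}

# Example for a full-resolution 1024x2048 Cityscapes image.
params = sample_augmentation(1024, 2048)
```

Sampling the parameters separately from applying them keeps the same random transform reusable for both the image and its label map, which pixel-level parsing requires.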