FasterSeg: Searching for Faster Real-time Semantic Segmentation

Authors: Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on popular segmentation benchmarks demonstrate the competency of FasterSeg. For example, FasterSeg can run over 30% faster than the closest manually designed competitor on Cityscapes, while maintaining comparable accuracy.
Researcher Affiliation | Collaboration | Wuyang Chen1, Xinyu Gong1, Xianming Liu2, Qian Zhang2, Yuan Li2, Zhangyang Wang1. 1Department of Computer Science and Engineering, Texas A&M University; 2Horizon Robotics Inc. {wuyang.chen,xygong,atlaswang}@tamu.edu, {xianming.liu,qian01.zhang,yuan.li}@horizon.ai
Pseudocode | No | No explicit pseudocode or algorithm block found.
Open Source Code | Yes | Our framework is implemented with PyTorch. The search, training, and latency measurement codes are available at https://github.com/TAMU-VITA/FasterSeg.
Open Datasets | Yes | We use Cityscapes (Cordts et al., 2016) as a testbed for both our architecture search and ablation studies. Also CamVid (Brostow et al., 2008) and BDD (Yu et al., 2018b).
Dataset Splits | Yes | Cityscapes (Cordts et al., 2016): 2,975 images for training and 500 for validation. CamVid (Brostow et al., 2008): 367 for training, 101 for validation, and 233 for testing. BDD (Yu et al., 2018b): 7,000 images for training and 1,000 for validation.
Hardware Specification | Yes | In all experiments, we use an Nvidia GeForce GTX 1080Ti for benchmarking the computing power.
Software Dependencies | Yes | We employ the high-performance inference framework TensorRT v5.1.5 and report the inference speed. All experiments are performed under CUDA 10.0 and cuDNN v7. Our framework is implemented with PyTorch.
Experiment Setup | Yes | We use 160 × 320 random image crops from half-resolution (512 × 1024) images in the training set. When learning network weights W, we use the SGD optimizer with momentum 0.9 and weight decay of 5 × 10−4, with an exponential learning rate decay of power 0.99. When learning the architecture parameters α, β, and γ, we use the Adam optimizer with learning rate 3 × 10−4. The entire architecture search optimization takes about 2 days on one 1080Ti GPU.
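The reported optimization hyperparameters can be sketched as plain constants plus the exponential decay rule. This is not the authors' code: the momentum, weight decay, Adam rate, and decay power come from the excerpt above, while the base SGD learning rate is an assumed placeholder, since the excerpt does not state it.

```python
# Hyperparameters quoted in the Experiment Setup row above.
SGD_MOMENTUM = 0.9       # momentum for network weights W
SGD_WEIGHT_DECAY = 5e-4  # weight decay for W
ADAM_LR_ARCH = 3e-4      # Adam learning rate for architecture params alpha, beta, gamma
LR_DECAY_POWER = 0.99    # exponential learning-rate decay factor

# Assumed placeholder: the excerpt does not give the SGD base learning rate.
BASE_LR_W = 0.01

def weight_lr(epoch: int,
              base_lr: float = BASE_LR_W,
              gamma: float = LR_DECAY_POWER) -> float:
    """Exponential decay schedule: lr_t = base_lr * gamma ** epoch."""
    return base_lr * gamma ** epoch
```

For example, with the assumed base rate of 0.01, the schedule starts at `weight_lr(0) == 0.01` and falls to roughly 0.0037 after 100 epochs (0.01 × 0.99^100).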