Virtual-to-Real: Learning to Control in Visual Semantic Segmentation

Authors: Zhang-Wei Hong, Yu-Ming Chen, Hsuan-Kung Yang, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Brian Hsi-Lin Ho, Chih-Chieh Tu, Tsu-Ching Hsiao, Hsin-Wei Hsiao, Sih-Pin Lai, Yueh-Chuan Chang, Chun-Yi Lee

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our architecture is evaluated in an obstacle avoidance task and a target following task. Experimental results show that our architecture significantly outperforms all of the baseline methods in both virtual and real environments, and demonstrates a faster learning curve than them. |
| Researcher Affiliation | Academia | Elsa Lab, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | For the evaluation tasks in the real world, we train DeepLab-v2 [Chen et al., 2016] on ADE20K [Zhou et al., 2017] and ICNet [Zhao et al., 2017a] on Cityscapes [Cordts et al., 2016] as the indoor and outdoor perception modules, respectively. |
| Dataset Splits | No | The paper mentions training and evaluation phases, but does not provide specific dataset split information (percentages, sample counts, or explicit splitting methodology) for training, validation, or test sets. |
| Hardware Specification | Yes | The authors thank Lite-On Technology Corporation for the support in research funding, and NVIDIA Corporation for the donation of the Titan X Pascal GPU used for this research. |
| Software Dependencies | No | The paper mentions using 'Unity3D' and 'Adam optimizer' but does not provide specific version numbers for these or any other ancillary software components needed to replicate the experiment. |
| Experiment Setup | Yes | We use Adam optimizer [Kingma and Ba, 2015], and set both the learning rate and epsilon to 0.001. Each model is trained for 5M frames, and the training data are collected by 16 worker threads for all experimental settings. (A minimal sketch of these settings follows the table.) |
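The "Experiment Setup" row quotes concrete hyperparameters. The following is a minimal, hypothetical PyTorch sketch of how those reported settings could be wired together; it is not the authors' code. Only the Adam learning rate and epsilon of 0.001, the 5M-frame training budget, and the 16 data-collection workers come from the paper, while the policy network, batch construction, and loss are illustrative placeholders.

```python
# Hypothetical sketch of the reported training configuration (not the authors' code).
import torch
import torch.nn as nn

policy = nn.Sequential(               # placeholder for the control policy network
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 4),                # e.g. discrete action logits
)

# Adam with both learning rate and epsilon set to 0.001, as reported in the paper.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3, eps=1e-3)

NUM_WORKERS = 16                      # training data collected by 16 worker threads
TOTAL_FRAMES = 5_000_000              # each model is trained for 5M frames
FRAMES_PER_UPDATE = NUM_WORKERS * 20  # illustrative rollout size per update

frames_seen = 0
while frames_seen < TOTAL_FRAMES:
    # Dummy batch standing in for the observations and targets the workers would collect.
    obs = torch.randn(FRAMES_PER_UPDATE, 256)
    targets = torch.randint(0, 4, (FRAMES_PER_UPDATE,))

    logits = policy(obs)
    loss = nn.functional.cross_entropy(logits, targets)  # placeholder loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    frames_seen += FRAMES_PER_UPDATE
```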