Hierarchical Adaptive Value Estimation for Multi-modal Visual Reinforcement Learning

Authors: Yangru Huang, Peixi Peng, Yifan Zhao, Haoran Xu, Mengyue Geng, Yonghong Tian

NeurIPS 2023

Reproducibility assessment: for each variable, the result and the supporting LLM response.
Research Type: Experimental. "We specifically highlight the potency of our approach within the challenging landscape of autonomous driving, utilizing the CARLA benchmark with neuromorphic event and depth data to demonstrate HAVE's capability and the effectiveness of its distinct components. Our approach achieves state-of-the-art performance on challenging autonomous driving tasks. The results show that our method achieves the highest episode reward and driving distance under all eight weather conditions."
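The eight weather conditions are not enumerated in this entry; in CARLA they are typically drawn from built-in presets. Below is a minimal sketch of an evaluation sweep over such presets, assuming the standard carla Python client; the subset shown is illustrative, not necessarily the paper's exact list.

```python
import carla

# Illustrative subset of CARLA's built-in weather presets; the paper's
# exact eight conditions are not listed in this excerpt.
PRESETS = [
    carla.WeatherParameters.ClearNoon,
    carla.WeatherParameters.WetNoon,
    carla.WeatherParameters.HardRainNoon,
    carla.WeatherParameters.ClearSunset,
]

client = carla.Client("localhost", 2000)  # default CARLA server address
client.set_timeout(10.0)
world = client.get_world()

for weather in PRESETS:
    world.set_weather(weather)
    # ... run one evaluation episode and record reward / driving distance ...
```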
Researcher Affiliation: Academia. Yangru Huang¹, Peixi Peng²,³, Yifan Zhao¹, Haoran Xu³,⁴, Mengyue Geng¹, Yonghong Tian¹,²,³ (¹School of Computer Science, Peking University; ²School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University; ³Peng Cheng Laboratory; ⁴School of Intelligent Systems Engineering, Sun Yat-sen University). Emails: yrhuang@stu.pku.edu.cn, {pxpeng, zhaoyf, mygeng, yhtian}@pku.edu.cn, xuhr9@mail2.sysu.edu.cn.
Pseudocode: No. The paper describes the methodology but does not include structured pseudocode or algorithm blocks.
Open Source Code: Yes. "The code of our paper can be found at https://github.com/Yara-HYR/HAVE."
Open Datasets: Yes. "To evaluate our approach under realistic and challenging multi-modal environments, we employ the CARLA simulator [8], which is a widely used open-source platform for autonomous driving research."
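For readers unfamiliar with the benchmark, here is a minimal sketch of instantiating a CARLA client with a roof-mounted RGB camera. The 128 x 128 resolution and 60-degree field of view match the experiment setup quoted below; the spawn point and mounting height are assumptions.

```python
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()
blueprints = world.get_blueprint_library()

# Spawn an ego vehicle at the first available spawn point (an arbitrary choice).
vehicle_bp = blueprints.filter("vehicle.*")[0]
vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])

# Roof-mounted RGB camera matching the paper's reported resolution and FOV.
camera_bp = blueprints.find("sensor.camera.rgb")
camera_bp.set_attribute("image_size_x", "128")
camera_bp.set_attribute("image_size_y", "128")
camera_bp.set_attribute("fov", "60")
mount = carla.Transform(carla.Location(x=0.0, z=2.4))  # roof height is an assumption
camera = world.spawn_actor(camera_bp, mount, attach_to=vehicle)

# Stream frames to disk; a training loop would consume them directly instead.
camera.listen(lambda image: image.save_to_disk(f"out/{image.frame:06d}.png"))
```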
Dataset Splits: No. The paper does not specify training/validation/test dataset splits as percentages or absolute counts. It states the training budget ("All methods are trained for 120k frames across 5 random seeds to report the mean and standard deviation of the rewards.") but gives no explicit validation split.
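For context, the quoted protocol (5 random seeds, mean and standard deviation of rewards) amounts to an aggregation like the sketch below; run_training is a hypothetical stand-in for one 120k-frame training run, not the paper's code.

```python
import numpy as np

def run_training(seed: int) -> float:
    """Hypothetical stand-in: train for 120k frames with this seed and
    return the final episode reward."""
    rng = np.random.default_rng(seed)
    return float(rng.normal(loc=500.0, scale=25.0))  # placeholder result

rewards = np.array([run_training(seed) for seed in range(5)])
print(f"episode reward: {rewards.mean():.1f} +/- {rewards.std():.1f}")
```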
Hardware Specification: No. The paper does not provide specific hardware details, such as the GPU or CPU models used to run the experiments.
Software Dependencies: No. The paper mentions implementing on top of SAC [14, 15] and DeepMDP [13] and using the stacking-based-on-time (SBT) [48] event representation, but does not provide version numbers for these or any other software dependencies.
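SBT ("stacking based on time") is commonly implemented by accumulating signed event polarities into fixed temporal bins; the sketch below follows that convention with 5 bins, matching the 5-channel event frames reported in the experiment setup. The function name and the (t, x, y, polarity) event layout are assumptions, and the paper's exact SBT variant may differ.

```python
import numpy as np

def stack_events_by_time(events: np.ndarray, num_bins: int = 5,
                         height: int = 128, width: int = 128) -> np.ndarray:
    """Stack an (N, 4) event array of (t, x, y, polarity) rows, with
    polarity in {-1, +1}, into a (num_bins, H, W) frame by time bins."""
    frames = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return frames
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps into [0, 1] and assign each event to a bin.
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    bins = np.clip((t_norm * num_bins).astype(int), 0, num_bins - 1)
    # Accumulate signed polarities per pixel within each temporal bin.
    np.add.at(frames, (bins, y, x), p)
    return frames
```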
Experiment Setup: Yes. "The spatial resolution of the input images is 128 x 128 and the channel numbers of RGB, event and depth frames are 3, 5 and 1, respectively. All methods are trained for 120k frames across 5 random seeds to report the mean and standard deviation of the rewards. We use the single camera view setting on the vehicle's roof with 60-degree views."
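To make the quoted shapes concrete, the sketch below builds the implied per-modality tensors. The channel-first layout and the naive channel-wise concatenation are assumptions for illustration only; HAVE's actual fusion is hierarchical and is not described by a plain concatenation.

```python
import numpy as np

# Per-modality observation tensors implied by the setup above
# (channel-first layout is an assumption).
rgb = np.zeros((3, 128, 128), dtype=np.float32)    # 3-channel RGB frame
event = np.zeros((5, 128, 128), dtype=np.float32)  # 5-channel SBT event frame
depth = np.zeros((1, 128, 128), dtype=np.float32)  # 1-channel depth frame

# Naive early fusion: a 9-channel stacked input (illustrative only).
obs = np.concatenate([rgb, event, depth], axis=0)
assert obs.shape == (9, 128, 128)
```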