Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization

Authors: Quanyi Li, Zhenghao Peng, Bolei Zhou

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The experiments show that HACO achieves a substantially high sample efficiency in the safe driving benchmark. It can train agents to drive in unseen traffic scenes with a handful of human intervention budget and achieve high safety and generalizability, outperforming both reinforcement learning and imitation learning baselines with a large margin. |
| Researcher Affiliation | Academia | Quanyi Li¹, Zhenghao Peng², Bolei Zhou³; ¹Centre for Perceptual and Interactive Intelligence, ²The Chinese University of Hong Kong, ³University of California, Los Angeles |
| Pseudocode | Yes | Algorithm 1: The workflow of HACO during training |
| Open Source Code | Yes | Code and demo videos are available at: https://decisionforce.github.io/HACO/. |
| Open Datasets | Yes | We employ a lightweight driving simulator MetaDrive (Li et al., 2021), which preserves the capacity to evaluate the safety and generalizability in unseen environments. ... Though we mainly describe the setting of MetaDrive in this section, we also experiment on CARLA (Dosovitskiy et al., 2017) simulator in Sec. 4.3. |
| Dataset Splits | No | The paper states: 'We split the driving scenes into the training set and test set with 50 different scenes in each set.' However, it does not explicitly mention the existence or details of a validation set split. |
| Hardware Specification | Yes | When training the baselines, we host 8 concurrent trials in an Nvidia GeForce RTX 2080 Ti GPU. Each trial consumes 2 CPUs with 8 parallel rollout workers. The main experiments of HACO is conducted on a local computer with an Nvidia GeForce RTX 2070 and repeat 3 times. |
| Software Dependencies | No | The paper mentions implementing algorithms using RLlib and that the simulator is based on Panda3D and Bullet Engine, but it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Information about other hyper-parameters is given in the Appendix. Appendix E lists hyper-parameters for HACO (Table 4) and various baselines (Tables 5-11), including Discounted Factor γ, Learning Rate, Train Batch Size, etc. |
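
The train/test split quoted under Dataset Splits (50 scenes in each set, no validation set) can be illustrated with a minimal sketch. It assumes that each driving scene is identified by an integer seed and that the two sets use adjacent, non-overlapping seed ranges. The function name, the seed ranges, and the starting seed are hypothetical and are not taken from the paper or from the MetaDrive API.

```python
def split_scene_seeds(num_train=50, num_test=50, start_seed=0):
    """Return two disjoint lists of scene seeds: (train, test).

    Assumption: scenes are indexed by consecutive integer seeds.
    The paper only states that each set holds 50 different scenes.
    """
    train = list(range(start_seed, start_seed + num_train))
    test_start = start_seed + num_train
    test = list(range(test_start, test_start + num_test))
    return train, test


train_seeds, test_seeds = split_scene_seeds()

# The two sets must not share any scene, otherwise the test
# performance would not measure generalization to unseen scenes.
assert set(train_seeds).isdisjoint(test_seeds)
print(len(train_seeds), len(test_seeds))
```

Keeping the two seed ranges disjoint is what makes the test scenes "unseen"; a validation set, which the paper does not describe, would need a third non-overlapping range.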