RLfOLD: Reinforcement Learning from Online Demonstrations in Urban Autonomous Driving

Authors: Daniel Coelho, Miguel Oliveira, Vítor Santos

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on the CARLA NoCrash benchmark demonstrate the effectiveness and efficiency of RLfOLD.
Researcher Affiliation | Academia | Daniel Coelho 1,2, Miguel Oliveira 1,2, Vítor Santos 1,2. 1 Department of Mechanical Engineering, University of Aveiro, 3810-193 Aveiro, Portugal. 2 Intelligent Systems Associate Laboratory (LASI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal. {danielsilveiracoelho, mriem, vitor}@ua.pt
Pseudocode | Yes | Algorithm 1: Reinforcement Learning from Online Demonstrations (RLfOLD)
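The paper's Algorithm 1 is not reproduced in this report. Purely as an illustration of the general idea its name and hyperparameters suggest (an online demonstrator consulted when the learner is uncertain, with a threshold u = 0.8 as in Table 1), here is a speculative Python sketch. Every class and method name is a toy stand-in, not the authors' algorithm or API.

```python
"""Speculative sketch (NOT the paper's Algorithm 1): an uncertainty-gated
switch between a learner policy and an online demonstrator. All classes are
toy stand-ins used only to make the control flow concrete."""

import random

UNCERTAINTY_THRESHOLD = 0.8  # matches "Uncertainty threshold (u)" in Table 1


class ToyAgent:
    """Stand-in for the SAC policy; returns an action and an uncertainty score."""

    def act(self, obs):
        return random.uniform(-1.0, 1.0), random.random()

    def update(self, buffer):
        pass  # a SAC gradient step would go here


class ToyExpert:
    """Stand-in for the online demonstrator (e.g., a rule-based driver)."""

    def act(self, obs):
        return 0.0


def training_step(obs, agent, expert, buffer):
    action, uncertainty = agent.act(obs)
    if uncertainty > UNCERTAINTY_THRESHOLD:
        action = expert.act(obs)  # defer to the online demonstration when unsure
    buffer.append((obs, action))  # store the executed transition
    agent.update(buffer)
    return action
```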
Open Source Code | Yes | The source code of RLfOLD is available at https://github.com/DanielCoelho112/rlfold.
Open Datasets | Yes | Our experiments on the CARLA NoCrash benchmark demonstrate the effectiveness and efficiency of RLfOLD. The environment was built using the CARLA driving simulator (version 0.9.10.1).
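For anyone reproducing the setup, a minimal sketch of connecting to a running CARLA 0.9.10.1 server with the standard Python client follows. The host, port, and town are assumptions (NoCrash is conventionally run in Town01/Town02), not values taken from the paper.

```python
# Minimal sketch: connecting to a running CARLA 0.9.10.1 server.
# Host, port, and town are assumptions, not values from the paper.
import carla

client = carla.Client("localhost", 2000)  # default CARLA RPC endpoint
client.set_timeout(10.0)                  # seconds to wait for the server
world = client.load_world("Town01")       # NoCrash training town (assumed)
print(world.get_map().name)
```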
Dataset Splits | No | The paper states that "the training process involves four distinct weather types, while the testing phase employs two different weather types" and evaluates on the NoCrash benchmark, whose predefined tasks include "train" and "test" environments. It does not, however, give percentages or counts for a separate validation split, nor any detail on the train/validation/test methodology beyond these high-level benchmark descriptions.
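For context, the weather split is realized in CARLA through named weather presets exposed by its Python API; a sketch of applying one is below. The excerpt does not say which four training and two testing presets the paper uses, so the preset chosen here is purely illustrative.

```python
# Illustrative only: applying a CARLA weather preset via the Python API.
# Which presets form the paper's train/test weather split is not stated
# in the excerpt; ClearNoon is just an example.
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()
world.set_weather(carla.WeatherParameters.ClearNoon)  # one of CARLA's presets
```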
Hardware Specification | Yes | All algorithms are trained on the same hardware, specifically a single NVIDIA RTX 2080 Ti.
Software Dependencies | No | The deep learning library used was PyTorch (Paszke et al. 2019); no version number is provided for PyTorch or any other software dependency.
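Since no versions are pinned, reproducers may want to record their own environment when rerunning the code. A small sketch follows; the particular packages logged are an assumption about a typical CARLA + PyTorch setup.

```python
# Sketch: recording the software environment, since the paper pins no versions.
# The packages listed are assumptions about a typical CARLA + PyTorch setup.
import platform

import torch

print("python:", platform.python_version())
print("torch :", torch.__version__)
print("cuda  :", torch.version.cuda)  # CUDA toolkit this PyTorch build targets
print("gpu   :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "cpu")
```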
Experiment Setup | Yes | Table 1 contains the main hyperparameters used by RLfOLD:
Replay buffer capacity: 100000
Batch size: 128
Action repeat: 2
Discount factor (γ): 0.85
Optimizer: Adam
Learning rate: 10^-3
Target Q-network update rate (ρ): 0.01
dim(i): 256
dim(w): 32
dim(v): 16
SAC networks size: 1024
Initial entropy parameter (α): 0.2
Uncertainty threshold (u): 0.8
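For convenience, the Table 1 values are restated below as a Python configuration, together with a sketch of the soft (Polyak) target update that the ρ value typically parameterizes in SAC. The key names and the soft_update helper are assumptions for illustration, not identifiers from the paper's repository.

```python
# Table 1 hyperparameters as a plain config. Key names are descriptive
# inventions; only the values come from the paper.
RLFOLD_CONFIG = {
    "replay_buffer_capacity": 100_000,
    "batch_size": 128,
    "action_repeat": 2,
    "discount_gamma": 0.85,
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "target_update_rho": 0.01,    # soft target Q-network update rate
    "dim_i": 256,
    "dim_w": 32,
    "dim_v": 16,
    "sac_hidden_size": 1024,
    "init_entropy_alpha": 0.2,
    "uncertainty_threshold_u": 0.8,
}

import torch


def soft_update(target_net, online_net, rho=RLFOLD_CONFIG["target_update_rho"]):
    """Polyak averaging: target <- (1 - rho) * target + rho * online."""
    with torch.no_grad():
        for t, o in zip(target_net.parameters(), online_net.parameters()):
            t.mul_(1.0 - rho).add_(rho * o)
```

The Adam learning rate plugs in as torch.optim.Adam(net.parameters(), lr=RLFOLD_CONFIG["learning_rate"]), following the standard PyTorch API.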