Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control

Authors: Jinzhu Luo, Dingyang Chen, Qi Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Experiments
Researcher Affiliation | Academia | Jinzhu Luo (University of South Carolina, jinzhu@email.sc.edu); Dingyang Chen (University of South Carolina, dingyang@email.sc.edu); Qi Zhang (University of South Carolina, qz5@cse.sc.edu)
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available on GitHub: https://github.com/JinzhuLuo/EuclideanDA
Open Datasets | Yes | All the tasks in our experiments are provided by the DeepMind Control Suite (DMControl) [12], powered by the MuJoCo physics simulator [27]. (A task-loading sketch appears after the table.)
Dataset Splits | No | The paper describes evaluation procedures but does not explicitly mention training/test/validation dataset splits.
Hardware Specification | Yes | The training runs are computed on single-GPU NVIDIA V100 machines, each taking roughly 2 hours to finish 1M training time steps for our method and all baselines except the SEGNN baseline, which takes roughly 70 hours to finish 1M steps.
Software Dependencies | No | The paper mentions 'MuJoCo' and specific algorithms/optimizers like 'DDPG' and 'Adam', but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We perform a hyperparameter search over ρ_aug := B_aug/B ∈ {0, 25%, 50%, 75%, 100%} separately for each task, while keeping all other hyperparameters at the DDPG defaults. Table 3 presents the full list of DDPG hyperparameters used in our method and in the baselines with DDPG as the base RL algorithm, including standard DDPG, DDPG + GN, DDPG + RAS, and DDPG + Ours. Examples include: learning rate 1e-4, Adam optimizer, mini-batch size 256.
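
The Experiment Setup row above quotes only a few hyperparameters and the augmentation-ratio grid. As a reading aid, here is a minimal illustrative sketch of that configuration and of the batch bookkeeping implied by ρ_aug = B_aug/B. Only the numeric values come from the quoted text; the names DDPG_HPARAMS, RHO_AUG_GRID, and augmented_batch_size are hypothetical, not taken from the paper or its code.

```python
# Illustrative sketch only: numeric values are quoted from the paper's setup text;
# all identifiers below are hypothetical.
DDPG_HPARAMS = {
    "learning_rate": 1e-4,  # quoted: Learning rate 1e-4
    "optimizer": "Adam",    # quoted: Optimizer Adam
    "batch_size": 256,      # quoted: Mini-batch size 256
}

# Candidate augmentation ratios searched per task: rho_aug = B_aug / B.
RHO_AUG_GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

def augmented_batch_size(batch_size: int, rho_aug: float) -> int:
    """Number of augmented samples B_aug in a mini-batch of size B."""
    return int(round(rho_aug * batch_size))

for rho in RHO_AUG_GRID:
    b_aug = augmented_batch_size(DDPG_HPARAMS["batch_size"], rho)
    print(f"rho_aug={rho:.2f} -> B_aug={b_aug} of {DDPG_HPARAMS['batch_size']} samples")
```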
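
For the Open Datasets row, a minimal sketch of loading a DMControl task with the dm_control package is shown below. This assumes the standard suite API; the cheetah/run domain-task pair is an illustrative choice, not a task reported in the paper.

```python
# Minimal sketch, assuming the standard dm_control suite API.
# The domain/task names below are illustrative, not taken from the paper.
import numpy as np
from dm_control import suite

env = suite.load(domain_name="cheetah", task_name="run")
action_spec = env.action_spec()

time_step = env.reset()
while not time_step.last():
    # Sample a uniformly random action within the spec bounds.
    action = np.random.uniform(action_spec.minimum, action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
```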