Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control
Authors: Jinzhu Luo, Dingyang Chen, Qi Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments |
| Researcher Affiliation | Academia | Jinzhu Luo, University of South Carolina, jinzhu@email.sc.edu; Dingyang Chen, University of South Carolina, dingyang@email.sc.edu; Qi Zhang, University of South Carolina, qz5@cse.sc.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available on GitHub: https://github.com/JinzhuLuo/EuclideanDA |
| Open Datasets | Yes | All the tasks in our experiments are provided by the DeepMind Control Suite (DMControl) [12], powered by the MuJoCo physics simulator [27] (see the environment-loading sketch after this table). |
| Dataset Splits | No | The paper describes evaluation procedures but does not explicitly mention training/test/validation dataset splits. |
| Hardware Specification | Yes | The training runs are computed on NVIDIA V100 single-GPU machines, each taking roughly 2 hours to finish 1M training time steps for our method and all the baselines except for the SEGNN baseline, which takes roughly 70 hours to finish 1M steps. |
| Software Dependencies | No | The paper mentions 'MuJoCo' and specific algorithms/optimizers like 'DDPG' and 'Adam', but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We perform a hyperparameter search over B_aug/B =: ρ_aug ∈ {0, 25%, 50%, 75%, 100%} separately for each task, while keeping all other hyperparameters the same as the DDPG defaults. Table 3 presents the full list of DDPG hyperparameters used in our method and in the baselines with DDPG as the base RL algorithm, including standard DDPG, DDPG + GN, DDPG + RAS, and DDPG + Ours. Examples include: learning rate 1e-4, Adam optimizer, mini-batch size 256 (a sketch of the augmentation-ratio setup follows the table). |
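
For readers unfamiliar with DMControl, the following is a minimal sketch of loading and stepping a suite task through the publicly documented `dm_control` API. The `cheetah`/`run` domain–task pair is an illustrative choice and not necessarily one of the paper's evaluated tasks.

```python
import numpy as np
from dm_control import suite

# Load one DMControl task (illustrative choice of domain and task).
env = suite.load(domain_name="cheetah", task_name="run")
action_spec = env.action_spec()

time_step = env.reset()
# Step with a uniformly random action just to exercise the environment.
action = np.random.uniform(action_spec.minimum, action_spec.maximum,
                           size=action_spec.shape)
time_step = env.step(action)
print(time_step.reward, list(time_step.observation.keys()))
```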
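The augmentation-ratio search described in the Experiment Setup row can be pictured as follows. This is a hedged sketch, not the authors' implementation: it assumes ρ_aug controls what fraction of each sampled mini-batch is replaced with Euclidean-augmented (here, randomly rotated) copies, and it pretends the first two observation dimensions are planar coordinates; the actual method transforms the tasks' full kinematic features.

```python
import numpy as np

# Mini-batch size is from the paper's Table 3; the ratio value here is one point
# of the search grid {0, 25%, 50%, 75%, 100%}. (The Adam learning rate 1e-4 from
# Table 3 would be used by the DDPG optimizer, which is not shown in this sketch.)
BATCH_SIZE = 256   # B
RHO_AUG = 0.5      # rho_aug = B_aug / B


def mix_augmented_batch(batch: dict, rho_aug: float = RHO_AUG) -> dict:
    """Replace a fraction rho_aug of a sampled mini-batch with augmented copies.

    Hypothetical layout: the first two observation dimensions are planar (x, y)
    coordinates rotated about the vertical axis.
    """
    n_aug = int(rho_aug * len(batch["obs"]))
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

    out = {k: v.copy() for k, v in batch.items()}
    # Apply the same rotation to obs and next_obs so each transition stays consistent.
    for key in ("obs", "next_obs"):
        out[key][:n_aug, :2] = batch[key][:n_aug, :2] @ rot.T
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = {"obs": rng.normal(size=(BATCH_SIZE, 8)),
             "next_obs": rng.normal(size=(BATCH_SIZE, 8)),
             "action": rng.normal(size=(BATCH_SIZE, 2)),
             "reward": rng.normal(size=(BATCH_SIZE, 1))}
    mixed = mix_augmented_batch(batch)
    print(mixed["obs"].shape)  # (256, 8): same shape, first rho_aug * B rows rotated
```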