Revisiting Data Augmentation in Deep Reinforcement Learning

Authors: Jianshu Hu, Yunpeng Jiang, Paul Weng

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6 EXPERIMENTAL RESULTS: In order to validate our theoretical analysis and show the effectiveness of our proposed algorithm, we perform a series of experiments to (1) experimentally validate our propositions, (2) conduct a case study explicitly showing the statistics we analyzed, (3) compare our final proposed algorithm with state-of-the-art baselines (RAD, DrAC, DrQ, DrQv2, SVEA) to verify its sample efficiency, and evaluate its generalization ability against SVEA, which was specifically designed for this purpose.
Researcher Affiliation | Academia | Jianshu Hu, Yunpeng Jiang (UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China; {hjs1998,jyp9961}@sjtu.edu.cn); Paul Weng (Data Science Research Center, Duke Kunshan University, Kunshan, Jiangsu, China; paul.weng@duke.edu)
Pseudocode | Yes | Algorithm 1: Data-Augmented Off-policy Actor-Critic Scheme (a hedged code sketch of such an update is given after this table)
Open Source Code | Yes | The source code of our method: https://github.com/Jianshu-Hu/drqv2
Open Datasets | Yes | We evaluate different methods on environments from the DeepMind Control Suite (Tassa et al., 2018) (a minimal environment-loading example follows the table)
Dataset Splits | No | The paper evaluates different methods on DeepMind Control Suite environments, but does not specify explicit training/validation/test splits with percentages, sample counts, or references to predefined splits.
Hardware Specification | Yes | ...to make the algorithms easier to run on our computing device, equipped with one NVIDIA RTX 3060 GPU and Intel i7-10700 CPU
Software Dependencies | No | The paper mentions software such as PyTorch and specifies the optimizer (Adam), but does not provide version numbers for these or other key software dependencies.
Experiment Setup | Yes | Table 6: Hyperparameters used in experiments on DMControl (drq) (an illustrative configuration sketch follows the table)
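
The Pseudocode entry above refers to Algorithm 1, a data-augmented off-policy actor-critic scheme. As a point of reference only, the snippet below is a minimal PyTorch sketch of what one data-augmented critic update in such a scheme might look like; the helper names (random_shift, augmented_critic_update) and the actor/critic/optimizer interfaces are illustrative assumptions, not the authors' implementation from the repository linked above.

```python
import torch
import torch.nn.functional as F


def random_shift(imgs, pad=4):
    """Random-shift augmentation commonly used in image-based RL:
    replicate-pad the observation, then crop back at a random offset."""
    n, _, h, w = imgs.shape
    padded = F.pad(imgs, (pad, pad, pad, pad), mode="replicate")
    tops = torch.randint(0, 2 * pad + 1, (n,)).tolist()
    lefts = torch.randint(0, 2 * pad + 1, (n,)).tolist()
    return torch.stack([padded[i, :, t:t + h, l:l + w]
                        for i, (t, l) in enumerate(zip(tops, lefts))])


def augmented_critic_update(actor, critic, critic_target, critic_opt,
                            batch, gamma=0.99):
    """One critic step of a data-augmented off-policy actor-critic scheme:
    augment current and next observations, bootstrap from the target critic,
    and minimize the TD error on the augmented data."""
    obs, action, reward, next_obs, not_done = batch
    obs_aug, next_obs_aug = random_shift(obs), random_shift(next_obs)

    with torch.no_grad():
        next_action = actor(next_obs_aug)
        target_q = reward + gamma * not_done * critic_target(next_obs_aug, next_action)

    td_loss = F.mse_loss(critic(obs_aug, action), target_q)

    critic_opt.zero_grad()
    td_loss.backward()
    critic_opt.step()
    return td_loss.item()
```

The actor update would follow the usual off-policy pattern (maximizing the critic's estimate of the actor's action, possibly on augmented observations as well); this sketch only illustrates where augmentation enters the critic loss.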
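
The Open Datasets entry refers to the DeepMind Control Suite. For readers unfamiliar with it, environments are instantiated through the dm_control package; the domain/task pair below is a generic illustration, not a statement about which tasks the paper evaluates.

```python
import numpy as np
from dm_control import suite

# Load one continuous-control task from the DeepMind Control Suite
# (the domain/task chosen here is only an example).
env = suite.load(domain_name="walker", task_name="walk")
action_spec = env.action_spec()

time_step = env.reset()
total_reward = 0.0
while not time_step.last():
    # Sample a uniformly random action within the action bounds.
    action = np.random.uniform(action_spec.minimum,
                               action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
    total_reward += time_step.reward or 0.0
print("episode return:", total_reward)
```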
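
Finally, the Experiment Setup entry points to Table 6 for hyperparameters. A common way to organize such a setup is a single configuration dictionary; the values below are placeholders typical of image-based DMControl agents and are not the values reported in the paper's Table 6 (only the use of Adam is stated in the paper).

```python
# Hypothetical experiment configuration for a DMControl run.
# Every numeric value is a placeholder, not taken from the paper's Table 6.
config = {
    "optimizer": "Adam",                 # the paper states Adam is used
    "learning_rate": 1e-4,               # placeholder
    "batch_size": 256,                   # placeholder
    "discount": 0.99,                    # placeholder
    "replay_buffer_capacity": 100_000,   # placeholder
    "frame_stack": 3,                    # placeholder
    "image_pad": 4,                      # placeholder (random-shift padding)
    "seed": 0,                           # placeholder
}
```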